The SAMATE Project Department of Homeland Security

SARD Manual: What the Test Case Status Means

From SAMATE


This page discusses some of the design issues and decisions of the Software Assurance Reference Dataset (SARD). The SARD user interface and its manual are on-line. It evolves quickly. We appreciate and acknowledge those who contributed test cases to the SARD.


Introduction

Every test case in the SARD has a "status", shown as a C (candidate), A (accepted) or D (deprecated) tag on the test case when it is viewed or found by searching the SARD through its web interface. The status gives the test case user an indicator of the case's quality, in terms of both its documentation and its construction. Each status is described in more detail below.

Candidate Test Cases

When a test case is first added to the SARD, it is assigned a status of "candidate". This means that the test case has not been reviewed by the SARD librarian to determine if it is complete in its documentation, correct in its construction and acceptable in quality. Until a test case is examined in this way, it will keep its status of "candidate".

Accepted Test Cases

An "accepted" test case is one that we believe meets the documentation, correctness and quality requirements that permit a SARD user to test a tool against a particular source code weakness. To meet those requirements, the test case must be thoroughly documented, must correctly represent a particular weakness in the source code, and must be of high enough quality that it is simple to understand and free of extraneous weaknesses that could confuse a user about the intent of the test case. If you download a test case that has a status of "accepted", you can expect the following:

Test Case Documentation Will Contain:

  • A description of the purpose of the test case.
  • The author's name.
  • A good (false alarm), bad (true positive), or mixed (some false alarm code and some true positive code) test case indicator.
  • If this test is paired with another test (e.g. a bad/good test case pair), the directory and filename of the associated test case will be provided.
  • If the test case is submitted by someone other than the author, the contributor's name will be provided.
  • The test case type will be provided. Possible values include "Source Code", "Binary" or "Pseudo Code".
  • If the test case is source code, its language (e.g. C, C++, or Java) is provided.
  • If necessary to compile, analyze or execute the test, instructions will be included. This may include compiler name/version, compiler directives, environment variable definitions, execution instructions or other test context information.
  • Whether the test is classified as "good", "bad", or "mixed", the Common Weakness Enumeration (CWE) ID and name are provided.
  • If this is a "bad" or "mixed" test, the file name and (if appropriate for the weakness type) line number containing the flaw will be provided.
  • If the flaw in the test case is a CWE "chain" or a "composite" type, contributing CWEs (and their corresponding file/line number) will be listed in the flaw documentation.
  • If code complexity (such as loops, inter-procedural data flow, or buffer aliasing) is part of the test case, the code complexity name will be included in the test case documentation.
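
As an illustration only, the documentation items above can be pictured as a record with a completeness check. The sketch below is not the SARD's actual schema: the class name CaseDoc, its field names and the function missing_documentation are all hypothetical, and only the always-required or clearly conditional items from the list are checked.

```python
from dataclasses import dataclass
from typing import List, Optional

# Allowed values taken from the list above.
KINDS = {"good", "bad", "mixed"}
TYPES = {"Source Code", "Binary", "Pseudo Code"}


@dataclass
class CaseDoc:
    """Hypothetical record of a test case's documentation (not the SARD schema)."""
    description: str = ""
    author: str = ""
    kind: str = ""                   # "good", "bad" or "mixed"
    case_type: str = ""              # "Source Code", "Binary" or "Pseudo Code"
    language: Optional[str] = None   # needed only for source code
    cwe_id: str = ""
    cwe_name: str = ""
    flaw_file: Optional[str] = None  # needed for "bad" and "mixed" cases
    flaw_line: Optional[int] = None  # given only if appropriate for the weakness


def missing_documentation(doc: CaseDoc) -> List[str]:
    """Return the names of documentation items that are absent or invalid."""
    missing = []
    if not doc.description:
        missing.append("description")
    if not doc.author:
        missing.append("author")
    if doc.kind not in KINDS:
        missing.append("kind")
    if doc.case_type not in TYPES:
        missing.append("case_type")
    if doc.case_type == "Source Code" and not doc.language:
        missing.append("language")
    if not (doc.cwe_id and doc.cwe_name):
        missing.append("cwe")
    if doc.kind in ("bad", "mixed") and not doc.flaw_file:
        missing.append("flaw_file")
    return missing
```

An empty list from missing_documentation would mean that the record carries every item this sketch checks. The paired test case, contributor name, build instructions, contributing CWEs and code complexity name are left out because they apply only in some situations.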

Source Code Will:

  • Compile (for compilable languages) or run without fatal errors (for complete interpreted languages).
  • Run without fatal error messages other than those expected for an incomplete program (if it is just functions or modules).
  • Not generate any warnings (unless the warnings are expected as part of the test).
  • Contain the documented weakness if the test case is a bad or mixed test case.
  • Contain no weaknesses at all if the test is a good test case.
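
The source code expectations above can be read as a small set of checks. The following is a minimal sketch under stated assumptions: the Observed record and the source_problems function are hypothetical names, and the sketch assumes that some earlier step (a compiler run, a review) has already recorded the fatal errors, warnings and weaknesses actually seen.

```python
from dataclasses import dataclass, field


@dataclass
class Observed:
    """Hypothetical summary of what compiling, running or reviewing a test case showed."""
    fatal_errors: int = 0                         # beyond those expected for an incomplete program
    warnings: set = field(default_factory=set)    # names of warnings actually emitted
    weaknesses: set = field(default_factory=set)  # CWE IDs actually present in the code


def source_problems(kind, documented_cwes, expected_warnings, observed):
    """Return the reasons the source code falls short of the expectations above."""
    problems = []
    if observed.fatal_errors:
        problems.append("fatal errors")
    # Warnings are acceptable only when they are expected as part of the test.
    unexpected = observed.warnings - set(expected_warnings)
    if unexpected:
        problems.append("unexpected warnings: " + ", ".join(sorted(unexpected)))
    if kind == "good":
        # A good test case must contain no weaknesses at all.
        if observed.weaknesses:
            problems.append(
                "good test case contains weaknesses: "
                + ", ".join(sorted(observed.weaknesses))
            )
    else:
        # A bad or mixed test case must contain the documented weakness.
        missing = set(documented_cwes) - observed.weaknesses
        if missing:
            problems.append("documented weakness missing: " + ", ".join(sorted(missing)))
    return problems
```

An empty result means that none of the listed expectations was violated, given the recorded observations.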

Deprecated Test Cases

Who can change test cases or test suites? When? Why?

To have long-term value, the content of a test case is "write once". That is, once source code is added to the SARD, it keeps the same test case ID and never changes. This permanence allows research work to refer to, say, SARD test case 1552, knowing that the exact code can always be retrieved. Later work can reliably get exactly what was used before.

What if there is a mistake in the code, for instance, a second, unintended weakness? The test case is marked with a status of "deprecated", and a new, corrected version is submitted to the SARD. Deprecated test cases should not be used for new work. They remain in the SARD with their original identifiers so that previous work can be redone.
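
The write-once rule and the deprecation step can be sketched as a small in-memory archive. This is an illustration only, not how the SARD is implemented: the class WriteOnceArchive and its methods are hypothetical, and the sketch assumes that a corrected version is stored under a new ID and starts as a candidate.

```python
class WriteOnceArchive:
    """Hypothetical in-memory model of write-once test cases (not the SARD itself)."""

    def __init__(self):
        self._source = {}   # case ID -> source code, never overwritten
        self._status = {}   # case ID -> "candidate", "accepted" or "deprecated"
        self._next_id = 1

    def add(self, source):
        """Store new source code under a fresh ID with status "candidate"."""
        case_id = self._next_id
        self._next_id += 1
        self._source[case_id] = source
        self._status[case_id] = "candidate"
        return case_id

    def accept(self, case_id):
        """Mark a reviewed candidate as accepted."""
        if self._status[case_id] != "candidate":
            raise ValueError("only a candidate can be accepted")
        self._status[case_id] = "accepted"

    def deprecate(self, case_id, corrected_source):
        """Mark a case deprecated and store the corrected code as a new case.

        The original code stays retrievable under its original ID.
        """
        if self._status[case_id] == "deprecated":
            raise ValueError("test case is already deprecated")
        self._status[case_id] = "deprecated"
        return self.add(corrected_source)

    def source(self, case_id):
        return self._source[case_id]

    def status(self, case_id):
        return self._status[case_id]
```

The archive has no method that edits stored source code. Only the status changes, so earlier work that cites an ID still retrieves exactly the code it used.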

Test suites are similarly "write once". Once they are designated, they should not change. A test suite might be superseded by an improved test suite, for example one that refers to test cases conforming to the latest language standard or that has better coverage.

