SRD Manual: Test Case Status - What it Means for Test Cases You Download
From SAMATE
This page discusses some of the design issues and decisions of the SAMATE Reference Dataset (SRD). The SRD user interface and its manual are on-line. It evolves quickly. We appreciate and acknowledge those who contributed test cases to the SRD.
Introduction
Each test case in the SRD has a "status", shown by the C (candidate), A (accepted) or D (deprecated) tag assigned to the test case when it is viewed while searching the SRD through its web interface. The purpose of giving SRD test cases a status is to provide the test case user with an indicator of the case's quality, both in terms of test case documentation and test case construction. Each status tag is described in more detail below.
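As an illustration only, the three tags can be modeled as a small enumeration. This is a sketch, not part of the SRD itself: the names `Status`, `status_from_tag` and `usable_for_new_work` are hypothetical, and the only facts taken from this page are the three tag letters and the guidance that deprecated test cases should not be used for new work.

```python
from enum import Enum


class Status(Enum):
    """Status tags as displayed in the SRD web interface."""
    CANDIDATE = "C"   # uploaded, not yet reviewed by the SRD librarian
    ACCEPTED = "A"    # reviewed; meets documentation, correctness and quality requirements
    DEPRECATED = "D"  # kept for reference only


def status_from_tag(tag: str) -> Status:
    """Map a one-letter tag to its status (hypothetical helper).

    Raises ValueError for any letter other than C, A or D.
    """
    return Status(tag.strip().upper())


def usable_for_new_work(status: Status) -> bool:
    """Only deprecated cases are explicitly discouraged for new work.

    Candidate cases may be used, but they have not been reviewed.
    """
    return status is not Status.DEPRECATED
```

A script that filters downloaded test cases could, for example, call `usable_for_new_work(status_from_tag("D"))` and skip the case when the result is false.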
Candidate Test Cases
When a test case is first uploaded to the SAMATE SRD, it is assigned a status of "candidate". This means that the test case has not been reviewed by the SAMATE SRD librarian to determine if it is complete in its documentation, correct in its construction and acceptable in quality. Until a test case is examined in this way, it will keep its status of "candidate".
Accepted Test Cases
An "accepted" test case is one that meets the documentation, correctness and quality requirements that permit an SRD user to test a tool against a particular source code weakness. To meet those requirements, the test case must be thorough in its documentation, correct in representing a particular weakness in the source code, and of high enough quality that it is simple to understand and free of extraneous weaknesses that could confuse a user about the intent of the test case. If you download a test case that has a status of "accepted", you can expect the following:
Test Case Documentation Will Contain:
- The test case description describing the purpose of the test case.
- The test case author's name.
- A good (false alarm) or bad (true positive) test case indicator, which tells the user what kind of test it is.
- If this test is paired with another test (e.g. a bad/good test case pair), the directory and filename of the associated test case will be provided.
- If the test case is submitted by someone other than the author, the contributor's name will be provided.
- The test case type will be provided. Possible values include "Source Code", "Binary" or "Pseudo Code".
- If the test case is source code, its language (e.g. C, C++, Java) will be provided.
- If necessary to compile, analyze or execute the test, instructions will be included. This may include compiler name/version, compiler directives, environment variable definitions, execution instructions or other test context information.
- Whether the test is classified as "good" (false alarm) or "bad" (true positive), the weakness name (e.g. CWE ID) associated with this test case will be provided.
- If this is a "bad" test, the file name and (if appropriate for the weakness type) line number containing the flaw will be provided.
- If the flaw in the test case is a CWE "chain" or a "composite" type, contributing CWEs (and their corresponding file/line number) will be listed in the flaw documentation.
- If code complexity (such as loops, inter-procedural data flow, buffer aliasing) is part of the test case, the code complexity name will be included in the test case documentation.
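The documentation items above can be pictured as a single record with a completeness check. The sketch below is an assumption made for illustration: the SRD does not publish this structure, and the class `CaseDoc`, its field names and the function `missing_documentation` are all hypothetical. Items that are only required in some situations (language for source code, flaw file for "bad" tests) are checked conditionally; purely optional items are not checked at all.

```python
from dataclasses import dataclass, field
from typing import List, Optional

CASE_TYPES = {"Source Code", "Binary", "Pseudo Code"}


@dataclass
class CaseDoc:
    """Hypothetical record of the documentation items for one test case."""
    description: str = ""
    author: str = ""
    kind: str = ""                      # "good" or "bad"
    case_type: str = ""                 # one of CASE_TYPES
    language: Optional[str] = None      # required when case_type is "Source Code"
    weakness: str = ""                  # weakness name, e.g. a CWE ID
    flaw_file: Optional[str] = None     # required for "bad" tests
    flaw_line: Optional[int] = None     # only where a line number makes sense
    contributor: Optional[str] = None   # only if different from the author
    paired_with: Optional[str] = None   # directory/filename of a paired test
    instructions: Optional[str] = None  # only if needed to compile/analyze/run
    complexities: List[str] = field(default_factory=list)


def missing_documentation(doc: CaseDoc) -> List[str]:
    """Return the names of required items that are absent from doc."""
    missing = []
    if not doc.description:
        missing.append("description")
    if not doc.author:
        missing.append("author")
    if doc.kind not in ("good", "bad"):
        missing.append("good/bad indicator")
    if doc.case_type not in CASE_TYPES:
        missing.append("test case type")
    if doc.case_type == "Source Code" and not doc.language:
        missing.append("language")
    if not doc.weakness:
        missing.append("weakness name")
    if doc.kind == "bad" and not doc.flaw_file:
        missing.append("flaw file name")
    return missing
```

An empty list from `missing_documentation` would mean that, under these assumptions, the record carries every item an accepted test case is expected to document.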
Source Code Will:
- Compile (for compiled languages) or run without fatal errors (for complete programs in interpreted languages).
- Run without fatal error messages other than those expected for an incomplete program (if it is just functions or modules).
- Not generate any warnings (unless the warnings are expected as part of the test).
- Contain the documented weakness if the test case is a "bad" (true positive) test case.
- Contain no weaknesses at all if the test case is a "good" (false alarm) test case.
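The source code expectations above can be expressed as a simple check over the results of a review. This is only a sketch under stated assumptions: the function `source_problems` and its parameters are hypothetical, and the inputs (whether the code built, which warnings appeared, which weaknesses a reviewer found) are assumed to have been gathered beforehand by whatever compiler or manual inspection applies to the test case.

```python
from typing import Iterable, List


def source_problems(kind: str,
                    builds_cleanly: bool,
                    warnings: Iterable[str],
                    expected_warnings: Iterable[str],
                    documented_weakness: str,
                    weaknesses_found: Iterable[str]) -> List[str]:
    """List the ways a test case falls short of the source code expectations.

    kind is "good" or "bad"; all other arguments are review results
    supplied by the caller (hypothetical inputs, not an SRD format).
    """
    problems = []
    if not builds_cleanly:
        problems.append("does not compile or run without fatal errors")
    # Warnings are acceptable only when they are an expected part of the test.
    unexpected = sorted(set(warnings) - set(expected_warnings))
    if unexpected:
        problems.append("unexpected warnings: " + ", ".join(unexpected))
    found = set(weaknesses_found)
    if kind == "bad" and documented_weakness not in found:
        problems.append("documented weakness not present")
    if kind == "good" and found:
        problems.append("good test contains weaknesses: " + ", ".join(sorted(found)))
    return problems
```

An empty result means none of the listed expectations was violated for the inputs given; it says nothing about weaknesses the reviewer did not look for.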
Deprecated Test Cases
Who can change test cases or test suites? When? Why?
To have long-term value, the content of a test case is "write once". That is, once source code is added to the SRD, it keeps the same name and never changes. This permanence allows research work to refer to, say, SRD test case 1552, knowing that the exact code can always be retrieved. Later work can reliably get exactly what was used before.
What if there is a mistake in the code, for instance, a second, unintended weakness? The test case would be marked with a status of "deprecated", and a new, corrected version would be submitted to the SRD. Deprecated test cases should not be used for new work. They remain in the SRD with their original identifiers so that previous work can be redone.
Test suites are similarly "write once". Once they are designated, they should not change. A test suite might be superseded by an improved test suite that refers to test cases conforming to the latest language standard or that has better coverage.
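The write-once policy for test cases can be modeled with a small in-memory store in which stored code is never overwritten and a correction always receives a new identifier. This is a hypothetical sketch for illustration, not how the SRD is implemented; the class `WriteOnceStore`, its methods and the sequential numbering are assumptions.

```python
from typing import Dict


class WriteOnceStore:
    """Hypothetical model of write-once test cases with deprecation."""

    def __init__(self) -> None:
        self._code: Dict[int, str] = {}
        self._status: Dict[int, str] = {}
        self._next_id = 1

    def add(self, code: str) -> int:
        """Store new code under a fresh identifier; it starts as a candidate."""
        case_id = self._next_id
        self._next_id += 1
        self._code[case_id] = code
        self._status[case_id] = "candidate"
        return case_id

    def code(self, case_id: int) -> str:
        """Return the code exactly as it was first stored."""
        return self._code[case_id]

    def status(self, case_id: int) -> str:
        return self._status[case_id]

    def replace(self, old_id: int, corrected_code: str) -> int:
        """Deprecate old_id and store the correction under a new identifier.

        The original code stays retrievable under old_id.
        """
        if old_id not in self._code:
            raise KeyError(old_id)
        self._status[old_id] = "deprecated"
        return self.add(corrected_code)
```

Note that the store deliberately has no method that edits existing code: the only way to fix a mistake is `replace`, which leaves the old identifier pointing at the old content.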
