The SAMATE Reference Dataset and Target Practice Test Suite
The SAMATE Reference Dataset (SRD) is a rapidly growing set of contributed test cases for measuring the capability of a software assurance (SwA) tool against a functional specification for that tool. This initial distribution is a compilation of C source code test cases for evaluating the functional capability of C source code scanning tools. Contributions from MIT Lincoln Laboratory and Fortify Software Inc. make up this initial set. Additional contributions from Klocwork Inc. and Ounce Labs Inc. will be added soon. We expect to expand the SRD to include other languages (e.g., C++ and Java) as well as test suites for other SwA tools (such as tools for requirements and software design documents).
Documentation for each test case is contained in the source files themselves. In the case of the MIT contribution, the first line of each test case contains a classification code describing the test case "signature" (in terms of code complexity). All MIT discrete test cases are "buffer overflow" examples, with permutations of some of the 22 coding variation factors to challenge a tool's ability to discover a buffer overflow or recognize a patched version of the overflow. MIT also contributed 14 models (scaled-down versions) of 3 real-world applications (bind, sendmail, and wu-ftpd).
Fortify Software has contributed C code test cases, the majority of which are also buffer overflow vulnerabilities. A number of race condition, command injection, and other vulnerabilities are also included in the test suite. Like the MIT test cases, the Fortify test cases are "self-documenting," with keywords describing the type of software flaw present in the code. Additionally, to provide a uniform way of classifying the complexity of the test cases, the MIT classification code is placed at the top of each test file.
Klocwork Inc. has donated an initial contribution of C++ test cases, the majority of which are memory management related (e.g., memory leaks, bad frees, and use-after-frees). They intend to follow up with an additional donation of Java test cases.
A subset of both the MIT (152 discrete test cases and 3 models) and Fortify (12) test cases makes up the "target practice" test suite. A representative group of well-understood and documented tests is presented as a "starting point" to get initial feedback from tool developers and users on how useful the test suite is. Both a "bad" (flawed) and a "good" (patched) version exist for each test case.
Confidentiality of Test Results - At no time is a tool developer required to report anything about their tool's performance against the Target Practice test suite. The purpose of the target practice is to solicit feedback on the SRD, not on the tools that run against it. If a tool developer wishes to provide further insight into the usefulness of the SRD by disclosing how their tool performed against it, they do so at their own discretion.
9 AM - 11:30 AM - Discussion of Test Results and Reference Dataset by target practice participants and workshop attendees
Disclaimer: Any commercial product mentioned is for information only; it does not imply recommendation or endorsement by NIST, nor does it imply that the products mentioned are necessarily the best available for the purpose.