The SAMATE Project

A Rodeo for Production Software Verification Tools Based on Formal Methods

This is currently a proposal.

The goal is to motivate progress by measuring the efficacy of tools that use formal methods, algorithms, and techniques for software verification of real-world software.

We will focus on tools that handle entire pieces of software (of production size and complexity), not just particular approaches or problems.

Our approach is to

  1. Connect with the community of tool developers and tool users,
  2. Develop tools to measure the code complexity of production software, using them to
  3. Select a suite of tests (and accompanying specs),
  4. Run tools against them in a competition or rodeo, and
  5. Assess findings to quantify improvement and identify research opportunities.
  6. Establish a software laboratory to support this work.

We presented the Rodeo proposal at TOOLympics at TACAS'19 in Prague, April 2019.

White Paper: What Formal Methods Can Help Assure Statistical Software?

A simplistic pseudorandom number generator may offer enough randomness for a video game, but not for a valid Monte Carlo simulation. Such problems are hard to find through traditional testing. Formal methods can complement testing to gain greater assurance that critical portions of programs are correct.
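
A classic illustration (our example, not one from the proposal) is the historical RANDU generator: it looks acceptable to casual one-dimensional tests, yet every consecutive triple of outputs satisfies a fixed linear relation, so all 3D points it produces fall on just 15 planes in the unit cube — ruinous for Monte Carlo work. A minimal sketch:

```python
def randu(seed):
    """Historical RANDU generator: x_{n+1} = 65539 * x_n mod 2^31 (seed odd)."""
    x = seed
    while True:
        x = (65539 * x) % 2**31
        yield x

# Generate a sample stream from an odd seed.
gen = randu(1)
xs = [next(gen) for _ in range(1000)]

# Because 65539 = 2^16 + 3, squaring gives
# x[k+2] = 6*x[k+1] - 9*x[k] (mod 2^31) for every k,
# a defect no amount of ordinary output testing is likely to reveal.
violations = sum((x2 - 6 * x1 + 9 * x0) % 2**31 != 0
                 for x0, x1, x2 in zip(xs, xs[1:], xs[2:]))
print(violations)  # 0 -- the relation holds for every triple
```

A formal analysis of the generator's algebra exposes this immediately, whereas statistical testing of the output stream typically does not.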

The purpose of this part of the Rodeo is to determine which formal methods, tools, and techniques can be used now (or have the potential in the near future) to gain higher assurance of statistical software with reasonable resources.

Our plan is to

  1. Contact key researchers and institutions acquainted with the purpose and architecture of statistical software. NIST will develop a list of current challenges facing those who are developing statistical software, especially errors that are difficult or impossible to detect with current methods.
  2. Conduct a preliminary survey of types of existing or prototype tools based on formal methods, proofs of concepts, or building blocks. The survey will include their capabilities.
  3. Hold a one-day workshop to get additional input. It will be co-located with the High Confidence Software and Systems (HCSS) Conference in Annapolis, Maryland, USA, on Wednesday, 1 May 2019. (Jonas Green room.)
  4. Deliver a report containing the survey, a summary of key research and researchers, workshop participant input, and an analysis of current state of the art and a plan for progress.

This part of the Rodeo is funded. We will deliver the report no later than the end of fiscal year 2019; we plan to deliver it much sooner.

Details of the Rodeo

Test Suite

We want the test suite to be realistic, scalable software and specs with a variety of properties to “prove.”


  • Computational or analytical software
    • Not embedded, distributed, real time, etc.
  • Open source; good quality; widely used
    • We may take production code as a basis to synthesize code or mix in generated code.
  • Some software not in C
  • Prefer software that already has formal specs

To select the production software, we will develop a tool (or tool suite) that measures structural properties or traits of code. These traits are those embodied in the benchmarks of other competitions, such as depth of recursion or number of states. We will calibrate this measurement tool against other benchmarks. We can then use this tool to scan thousands of applications to select those to be used in the test suite.

The test suite can effectively be scaled by selecting other software of the proper complexity.
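
The measurement tool described above could begin as something quite small. The following sketch (our illustration, not the planned tool, which would target C and other production languages) measures a few structural traits of Python source using the standard `ast` module; real benchmark selection would add traits such as depth of recursion and number of states:

```python
import ast

def structural_traits(source: str) -> dict:
    """Toy structural-trait measurement: counts functions, maximum
    loop-nesting depth, and directly recursive functions."""
    tree = ast.parse(source)
    traits = {"functions": 0, "max_loop_nesting": 0, "recursive_functions": 0}

    def visit(node, loop_depth):
        if isinstance(node, ast.FunctionDef):
            traits["functions"] += 1
            # Direct recursion: the function body calls its own name.
            if any(isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
                   and n.func.id == node.name for n in ast.walk(node)):
                traits["recursive_functions"] += 1
        if isinstance(node, (ast.For, ast.While)):
            loop_depth += 1
            traits["max_loop_nesting"] = max(traits["max_loop_nesting"],
                                             loop_depth)
        for child in ast.iter_child_nodes(node):
            visit(child, loop_depth)

    visit(tree, 0)
    return traits

src = """
def fact(n):
    return 1 if n < 2 else n * fact(n - 1)

def table():
    for i in range(5):
        for j in range(5):
            print(i * j)
"""
print(structural_traits(src))
# {'functions': 2, 'max_loop_nesting': 2, 'recursive_functions': 1}
```

Running such a scanner over thousands of candidate applications yields trait profiles that can be matched against the benchmarks of existing competitions.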

A Software Laboratory

Just as a research chemist or physicist needs a stocked laboratory with skilled staff to assemble, calibrate, and run equipment, we need a software laboratory.

It will consist of programs and tools to manipulate, transform, and analyze software, along with the in-house expertise to use them.

We need this to quickly gauge the potential of new techniques and approaches, and to confirm (or refute) hypotheses, e.g., would we detect this pattern if loops were unrolled or spread across function invocations?

The following types of tools could be in the laboratory:

  • Semantic and structural search tools
  • Transformers capable of obfuscation and other changes
  • Tools to refactor and combine common code
  • Inlining and macro expansion tools to do the inverse
  • Program mutators or bug injectors
  • Code generators to produce many different programs with specified characteristics
  • Various compilers and interpreters (many already exist)

Examples of such tools are LLVM, ROSE, Txl, Frama-C, DMS, and Bandera.
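
To suggest how small a useful laboratory tool can be, here is a toy program mutator (our sketch; a production mutator would operate on C via frameworks such as LLVM or ROSE). It weakens every `<` comparison to `<=` in Python source, planting off-by-one faults that a verification tool should then catch:

```python
import ast

class ComparisonFlipper(ast.NodeTransformer):
    """Toy bug injector: weakens every `<` to `<=`,
    planting off-by-one faults."""
    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.LtE() if isinstance(op, ast.Lt) else op
                    for op in node.ops]
        return node

def mutate(source: str) -> str:
    """Return the source with all `<` comparisons weakened to `<=`."""
    tree = ComparisonFlipper().visit(ast.parse(source))
    return ast.unparse(tree)  # ast.unparse requires Python 3.9+

print(mutate("while i < len(a):\n    i = i + 1"))
```

Seeding known faults this way lets us measure whether a tool's findings actually improve as the technique matures.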

Please direct suggestions, comments, or questions to Paul Black.

The FMSwVRodeo web site was created Fri Nov 9 10:42:53 2018 by Paul E. Black.
Updated Thu Apr 18 11:46:09 2019 by Paul E. Black.
