Tandem Execution Wiki

Definition

Tandem execution is a model-comparison testing approach used in TestRIG for RISC-V implementations. TestRIG generates random instruction sequences, executes the same sequences on a model and on the implementation under test, and compares their execution traces. The source explicitly identifies this trace-comparison workflow as "tandem execution". [Tandem execution definition]

Purpose

The technique is a pragmatic compromise for cases where a full formal proof of equivalence between a processor implementation and the RISC-V Sail model is not routinely practical at whole-processor scale. Tandem execution does not prove equivalence, but it can demonstrate divergence between the model and the implementation, and the source states that it is usable in all stages of development. [Purpose and limitation]

Operation in TestRIG

In the TestRIG workflow, random instruction sequences are generated and executed on both sides of the comparison. The relevant architectural effects are recorded as execution traces, which are compared to identify mismatches. TestRIG uses the RISC-V Formal Interface (RVFI) to observe the state change that each instruction produces in the implementation under test. [Trace comparison and RVFI observation]
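The generate, execute, and compare loop described above can be sketched on a toy machine. This is a minimal illustration under stated assumptions, not TestRIG code: the four-register machine, the `TraceEntry` record (a simplified stand-in for an RVFI record, not the real RVFI field set), and the names `step_reference`, `step_buggy`, `random_program`, and `first_divergence` are all hypothetical.

```python
import random
from dataclasses import dataclass

MASK = 0xFFFFFFFF  # 32-bit wrap-around for the toy register file


@dataclass(frozen=True)
class TraceEntry:
    """State change caused by one instruction (hypothetical, simplified)."""
    op: str
    rd: int
    rd_value: int


def step_reference(regs, instr):
    """Toy model: register 0 is hard-wired to zero, so writes to it are dropped."""
    op, rd, rs1, rs2 = instr
    a, b = regs[rs1], regs[rs2]
    value = (a + b) & MASK if op == "add" else (a - b) & MASK
    if rd != 0:
        regs[rd] = value
    return TraceEntry(op, rd, regs[rd])


def step_buggy(regs, instr):
    """Toy implementation under test with a deliberate bug: register 0 is writable."""
    op, rd, rs1, rs2 = instr
    a, b = regs[rs1], regs[rs2]
    regs[rd] = (a + b) & MASK if op == "add" else (a - b) & MASK
    return TraceEntry(op, rd, regs[rd])


def run(step, program):
    """Execute a program from a fixed initial state and collect its trace."""
    regs = [0, 1, 2, 3]
    return [step(regs, instr) for instr in program]


def random_program(rng, length):
    """Generate a random sequence of (op, rd, rs1, rs2) tuples."""
    return [
        (rng.choice(["add", "sub"]), rng.randrange(4), rng.randrange(4), rng.randrange(4))
        for _ in range(length)
    ]


def first_divergence(model_trace, impl_trace):
    """Return the index of the first mismatching trace entry, or None."""
    for index, (model_entry, impl_entry) in enumerate(zip(model_trace, impl_trace)):
        if model_entry != impl_entry:
            return index
    return None
```

For the fixed sequence `[("add", 1, 2, 3), ("add", 0, 1, 2), ("sub", 2, 0, 1)]`, `first_divergence(run(step_reference, prog), run(step_buggy, prog))` returns 1, because the second instruction writes to register 0. A result of `None` on a sequence from `random_program` shows only that no divergence was found for that sequence, not that the two sides are equivalent.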

TestRIG also uses Direct Instruction Injection (DII) for test injection: instead of fetching the next instruction from program memory according to the CPU program counter, the test harness supplies the next instruction to execute. This injection mechanism is complementary to tandem execution because it supplies the instruction stream whose execution traces are then compared. [Direct Instruction Injection role]
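The difference between fetching by program counter and injecting instructions can be sketched with two interchangeable fetch sources. This is an illustrative toy, not the DII protocol itself: the classes `MemoryFetch` and `InjectedFetch`, the `run_core` loop, and the tuple instruction encoding are hypothetical.

```python
from collections import deque


class MemoryFetch:
    """Conventional fetch: the program counter selects the next instruction."""

    def __init__(self, memory):
        self.memory = memory  # list of instructions, one per 4-byte word

    def next_instruction(self, pc):
        return self.memory[pc // 4]


class InjectedFetch:
    """Injection-style fetch: the harness supplies the next instruction,
    whatever the program counter currently is."""

    def __init__(self, instructions):
        self.queue = deque(instructions)

    def next_instruction(self, pc):
        return self.queue.popleft()


def run_core(fetch, steps):
    """Toy core: executes `steps` instructions and records (pc, instruction).
    A ("jump", target) instruction sets the pc; anything else advances it by 4."""
    pc = 0
    executed = []
    for _ in range(steps):
        instr = fetch.next_instruction(pc)
        executed.append((pc, instr))
        if instr[0] == "jump":
            pc = instr[1]
        else:
            pc += 4
    return executed
```

With `MemoryFetch`, a jump changes which instruction runs next, so the harness would have to place code at every reachable address. With `InjectedFetch`, the core still updates its program counter, but the instruction that follows a jump is simply the next one the harness chose to supply.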

Verification-engine form

Within TestRIG, an interactive Verification Engine (VEngine) stimulates RISC-V implementations over RVFI-DII sockets. An RVFI-DII-compatible implementation can reset, consume instruction sequences, and report execution traces through the interface. A VEngine may include an internal RISC-V model or may drive two independent implementations and compare their RVFI traces, as described for QCVEngine. [VEngine comparison model]
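The reset, consume, and report contract, and the mode in which an engine drives two independent implementations, can be sketched as follows. This is a hedged illustration only: it uses in-process objects rather than sockets, and the `Target` protocol, the `Accumulator` toy machine, and `compare_targets` are hypothetical names that do not come from TestRIG or QCVEngine.

```python
from typing import List, Protocol, Tuple

Instr = Tuple[str, int]  # hypothetical encoding: (operation, immediate)
Trace = Tuple[int, int]  # hypothetical record: (instruction index, accumulator value)


class Target(Protocol):
    """What the engine assumes of each side: it can reset, consume an
    instruction sequence, and report one trace record per instruction."""

    def reset(self) -> None: ...

    def execute(self, program: List[Instr]) -> List[Trace]: ...


class Accumulator:
    """Toy target with a single accumulator register of configurable width."""

    def __init__(self, width_mask: int = 0xFF) -> None:
        self.mask = width_mask
        self.acc = 0

    def reset(self) -> None:
        self.acc = 0

    def execute(self, program: List[Instr]) -> List[Trace]:
        traces = []
        for index, (op, imm) in enumerate(program):
            if op == "addi":
                self.acc = (self.acc + imm) & self.mask
            elif op == "clr":
                self.acc = 0
            traces.append((index, self.acc))
        return traces


def compare_targets(a: Target, b: Target, program: List[Instr]):
    """Reset both targets, run the same program on each, and return the
    pairs of trace records that differ."""
    a.reset()
    b.reset()
    trace_a = a.execute(program)
    trace_b = b.execute(program)
    return [(x, y) for x, y in zip(trace_a, trace_b) if x != y]
```

Because `compare_targets` depends only on the `Target` contract, either argument could be an internal model or a second implementation. For example, comparing `Accumulator(0xFF)` with `Accumulator(0x7F)` on `[("addi", 100), ("addi", 100), ("clr", 0)]` yields a single mismatch at instruction index 1, where the narrower target truncates 200 to 72.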

Scope

The source describes tandem execution in the context of comparing executable formal models, software ISA simulators, and simulated hardware designs, rather than completed fabricated chips. TestRIG has been used to test standard RISC-V extensions and the experimental CHERI security extension, and the authors report that it detects issues not only in instruction semantics but also in the pipeline and data caches. [Scope and reported effectiveness]