STIMSMITH

Tandem Verification

Concept WIKI v2 · 5/30/2026

Tandem Verification is a RISC-V testing approach in which the same generated instruction sequences are executed on a reference model and an implementation under test, and their execution traces are compared. A mismatch demonstrates divergence; matching traces do not prove equivalence.

Overview

Tandem Verification is a trace-comparison approach used in the TestRIG framework for RISC-V implementations. TestRIG generates random instruction sequences, executes the same sequences on a model and an implementation under test, and compares their execution traces. The cited paper describes this as a pragmatic compromise for checking equivalence at the processor level: it does not prove equivalence, but it can demonstrate divergence and can be used throughout development. [C1]
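The generate, execute, and compare loop described above can be sketched in a few lines of Python. This is a toy illustration, not TestRIG code: the `TraceEntry` record, the four-register `run` executor, the `buggy` flag, and the ADDI-only instruction tuples are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    """Hypothetical per-instruction record: destination register and its value."""
    rd: int
    value: int


def random_sequence(rng, length):
    """Generate random ('addi', rd, rs, imm) tuples over four registers."""
    return [("addi", rng.randrange(4), rng.randrange(4), rng.randint(-8, 8))
            for _ in range(length)]


def run(instrs, buggy=False):
    """Toy executor. With buggy=True, negative immediates are ignored."""
    regs = [0] * 4
    trace = []
    for _op, rd, rs, imm in instrs:
        value = (regs[rs] + imm) & 0xFFFFFFFF
        if buggy and imm < 0:
            value = regs[rs]          # injected bug for illustration
        if rd != 0:                   # register 0 stays hardwired to zero
            regs[rd] = value
        trace.append(TraceEntry(rd, regs[rd]))
    return trace


def first_divergence(model_trace, impl_trace):
    """Return the index of the first mismatching entry, or None if none is found."""
    for i, (m, t) in enumerate(zip(model_trace, impl_trace)):
        if m != t:
            return i
    if len(model_trace) != len(impl_trace):
        return min(len(model_trace), len(impl_trace))
    return None
```

A harness would call `random_sequence(random.Random(seed), n)`, pass the result to both executors, and report the index returned by `first_divergence`. A `None` result only means that this particular sequence exposed no difference.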

Role in TestRIG

TestRIG is a testing framework for RISC-V implementations. Its tandem-verification workflow is motivated by the difficulty of routinely proving whole-processor equivalence, especially for full out-of-order microarchitectures. Instead of relying on a proof of equivalence, TestRIG compares observable behavior between implementations. [C1]

In the cited setup, TestRIG uses:

  • RISC-V Formal Interface (RVFI) to observe the state change produced by each instruction that the implementation under test executes. [C2]
  • Direct Instruction Injection (DII) to provide the next instruction from the test harness, rather than fetching it from program memory according to the CPU program counter. [C2]
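As a rough sketch of how the two interfaces fit together, the Python below models a core that receives each instruction word from the harness (the DII idea) and reports one record per retired instruction (the RVFI idea). The `ToyCore` class and `inject` helper are hypothetical, the core handles only RV32I ADDI, and the record fields are a small subset named after RVFI signals rather than the full interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RvfiRecord:
    """Simplified subset of an RVFI-style retirement record."""
    order: int      # index of the retired instruction
    insn: int       # instruction word that was executed
    pc_rdata: int   # program counter before the instruction
    pc_wdata: int   # program counter after the instruction
    rd_addr: int    # destination register
    rd_wdata: int   # value written to the destination register


class ToyCore:
    """Hypothetical core that executes whatever word the harness injects."""

    def __init__(self):
        self.regs = [0] * 32
        self.pc = 0
        self.retired = 0

    def step(self, insn):
        # Check opcode (bits 6:0) and funct3 (bits 14:12) for ADDI.
        if insn & 0x707F != 0x13:
            raise ValueError("toy core only supports ADDI")
        rd = (insn >> 7) & 0x1F
        rs1 = (insn >> 15) & 0x1F
        imm = insn >> 20
        if imm & 0x800:               # sign-extend the 12-bit immediate
            imm -= 0x1000
        value = (self.regs[rs1] + imm) & 0xFFFFFFFF
        if rd != 0:                   # x0 is hardwired to zero
            self.regs[rd] = value
        record = RvfiRecord(self.retired, insn, self.pc,
                            (self.pc + 4) & 0xFFFFFFFF, rd, self.regs[rd])
        self.pc = record.pc_wdata
        self.retired += 1
        return record


def inject(core, instructions):
    """Feed instruction words directly to the core; no fetch from memory."""
    return [core.step(insn) for insn in instructions]
```

For example, `inject(ToyCore(), [0x00500093, 0xFFF08113])` injects `addi x1, x0, 5` followed by `addi x2, x1, -1`. The harness, not the program counter, decides which instruction comes next, which is what lets a shrunk sequence be replayed without relinking a program image.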

The paper states that the work compares executable formal models, software ISA simulators, and simulated hardware designs, rather than completed fabricated chips. [C3]

Relationship to tandem execution

The cited paper uses the term tandem execution for running the same randomly generated instruction sequences on both a model and an implementation under test, then comparing their execution traces. Tandem Verification is the verification use of that trace-comparison pattern: a mismatch in the traces is treated as evidence of a divergence to investigate. [C1]

Failure discovery and reduction

When QCVEngine finds a counterexample, QuickCheck list shrinking removes instructions from the sequence and retests it, discarding those that are irrelevant to the failure. The paper also describes “smart shrinking” functions that transform a sequence, for example by propagating an output register to later input operands, so that more irrelevant instructions can be eliminated and a smaller counterexample remains. [C4]
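The remove-and-retest idea behind list shrinking can be sketched as a greedy loop. This is a simplified stand-in written in Python, not QuickCheck's actual algorithm (which also removes larger chunks and shrinks individual elements), and it does not cover smart shrinking. The `fails` predicate is an assumption standing in for "the model and implementation traces diverge on this sequence".

```python
def shrink_candidates(seq):
    """Yield every sequence obtained by deleting exactly one instruction."""
    for i in range(len(seq)):
        yield seq[:i] + seq[i + 1:]


def shrink(seq, fails):
    """Greedily delete instructions while the failure still reproduces."""
    if not fails(seq):
        raise ValueError("starting sequence must be a counterexample")
    improved = True
    while improved:
        improved = False
        for candidate in shrink_candidates(seq):
            if fails(candidate):      # retest the smaller sequence
                seq = candidate
                improved = True
                break
    return seq


def toy_fails(seq):
    """Hypothetical failure: a 'mul' that executes at or after an 'lw'."""
    return "lw" in seq and "mul" in seq[seq.index("lw"):]
```

With this toy predicate, `shrink(["addi", "lw", "xor", "mul", "sub"], toy_fails)` keeps only the two instructions that the failure depends on. The result is locally minimal: deleting any single remaining instruction makes the failure disappear.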

Some sequences can be marked non-shrinkable to preserve the initialization they perform. Keeping that initialization avoids trivial counterexamples, such as failures caused by uninitialized floating-point registers, and lets testing reach more useful divergences, such as those in exception conditions or rounding modes. [C5]
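One way to picture a non-shrinkable marking is a per-item flag that the shrinker respects. The Python sketch below is an assumption about how such a flag could behave, not a description of QCVEngine's implementation; `Item`, `shrink_keeping_protected`, and the instruction labels are hypothetical.

```python
from typing import NamedTuple


class Item(NamedTuple):
    """Hypothetical sequence element with a shrinkability flag."""
    text: str
    shrinkable: bool = True


def shrink_keeping_protected(seq, fails):
    """Delete shrinkable items while the failure persists; never drop protected ones."""
    changed = True
    while changed:
        changed = False
        for i, item in enumerate(seq):
            if not item.shrinkable:
                continue              # protected initialization stays in place
            candidate = seq[:i] + seq[i + 1:]
            if fails(candidate):
                seq, changed = candidate, True
                break
    return seq


def toy_fails(seq):
    """Hypothetical failure: any sequence that contains an 'fdiv'."""
    return any(item.text == "fdiv" for item in seq)
```

Given `[Item("fp_init", False), Item("addi"), Item("fdiv"), Item("xor")]`, the shrinker removes `addi` and `xor` but leaves `fp_init`, so the reported counterexample still runs with initialized floating-point state.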

Sequences may also include assertions, such as requiring the value written by the previous instruction to be non-zero. The paper notes that assertions can fail without a divergence, so sequences with assertions do not require tandem verification to discover a failure. [C6]
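Because an assertion is checked against a single trace, it needs no second implementation. The Python sketch below illustrates this under assumed data shapes: `written_values` is a hypothetical list of the values written by each instruction, and each assertion is an `(index, label, predicate)` tuple.

```python
def non_zero(value):
    """Example predicate: the written value must be non-zero."""
    return value != 0


def failed_assertions(written_values, assertions):
    """Return (index, label) for every assertion whose predicate is false."""
    failures = []
    for index, label, predicate in assertions:
        if not predicate(written_values[index]):
            failures.append((index, label))
    return failures
```

Calling `failed_assertions([5, 0, 7], [(0, "non-zero", non_zero), (1, "non-zero", non_zero)])` reports the assertion attached to the second instruction, using only one implementation's trace.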

Scope and limitations

As described in the cited paper, Tandem Verification is a randomized testing and trace-comparison technique, not a formal proof of processor equivalence. Its value lies in demonstrating divergence between a model and an implementation during development, including issues in instruction semantics, pipelines, and data caches. [C1]

CITATIONS

[1] C1: TestRIG checks equivalence pragmatically by generating random instruction sequences, executing the same sequences on a model and implementation under test, and comparing traces; this tandem execution does not prove equivalence but can demonstrate divergence and is usable during development. Source: "Randomized Testing of RISC-V CPUs using Direct Instruction Injection".
[2] C2: TestRIG uses RVFI to observe state changes after each instruction and Direct Instruction Injection to supply the next instruction from the test harness rather than from program memory according to the CPU program counter. Source: "Randomized Testing of RISC-V CPUs using Direct Instruction Injection".
[3] C3: The cited work compares executable formal models, software ISA simulators, and simulated hardware designs rather than completed fabricated chips. Source: "Randomized Testing of RISC-V CPUs using Direct Instruction Injection".
[4] C4: After QCVEngine finds a counterexample, QuickCheck shrinking and added smart-shrinking functions can remove or transform instructions to simplify the failing sequence. Source: "Randomized Testing of RISC-V CPUs using Direct Instruction Injection".
[5] C5: Non-shrinkable sequences can preserve initialization needed to cover divergences in initial state and avoid trivial counterexamples. Source: "Randomized Testing of RISC-V CPUs using Direct Instruction Injection".
[6] C6: Sequences can include assertions, and assertion failures can be found without a divergence, meaning such sequences do not require tandem verification to discover a failure. Source: "Randomized Testing of RISC-V CPUs using Direct Instruction Injection".

VERSION HISTORY

v2 · 5/30/2026 · gpt-5.5 (current)
v1 · 5/27/2026 · gpt-5.5