STIMSMITH

Model-based Random Testing

Concept WIKI v1 · 5/27/2026

Model-based random testing is a pragmatic verification approach in which randomly or directed-randomly generated test sequences are run against a reference model and an implementation, with execution traces compared to find divergences. In the RISC-V context, it is used when full formal equivalence of complex microarchitectures is difficult; it cannot prove equivalence, but it can refute it by producing counterexamples.

Overview

Model-based random testing is a functional verification technique that compares an implementation against a model while executing generated test sequences. In the cited RISC-V context, the approach is motivated by the difficulty of formally proving equivalence for complex microarchitectures. Rather than proving equivalence between a formal model and an implementation, model-based random testing can detect divergences and refute equivalence with counterexamples. [C1]

Method

A typical model-based random-testing workflow generates test programs or instruction sequences and executes them on both a golden model and a processor implementation under development. Divergence is commonly detected by comparing execution traces. [C2]
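The workflow above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-instruction toy ISA, the `run` function standing in for both the golden model and the implementation, and the deliberately injected operand-swap bug. None of it is taken from the cited source or from any real RISC-V tool.

```python
import random

# Hypothetical toy ISA: (op, rd, rs, imm) over four 32-bit registers.
OPS = ("addi", "add", "sub")

def random_program(rng, length):
    """Generate a random instruction sequence for the toy ISA."""
    return [(rng.choice(OPS), rng.randrange(4), rng.randrange(4),
             rng.randrange(-8, 8)) for _ in range(length)]

def run(program, swap_sub_operands=False):
    """Execute a program and return a trace of (index, rd, value) records.

    With swap_sub_operands=True this acts as a buggy implementation;
    otherwise it acts as the golden model.
    """
    regs = [0, 0, 0, 0]
    trace = []
    for i, (op, rd, rs, imm) in enumerate(program):
        if op == "addi":
            value = regs[rs] + imm
        elif op == "add":
            value = regs[rd] + regs[rs]
        elif swap_sub_operands:
            value = regs[rs] - regs[rd]  # injected bug
        else:
            value = regs[rd] - regs[rs]
        regs[rd] = value & 0xFFFFFFFF
        trace.append((i, rd, regs[rd]))
    return trace

def first_divergence(trace_a, trace_b):
    """Return the index of the first differing trace record, or None."""
    for step, (a, b) in enumerate(zip(trace_a, trace_b)):
        if a != b:
            return step
    if len(trace_a) != len(trace_b):
        return min(len(trace_a), len(trace_b))
    return None

def find_counterexample(rng, tries, length):
    """Run random programs on both sides until the traces diverge."""
    for _ in range(tries):
        program = random_program(rng, length)
        step = first_divergence(run(program), run(program, True))
        if step is not None:
            return program, step
    return None

handwritten = [("addi", 1, 0, 5), ("sub", 2, 1, 0)]
divergence_step = first_divergence(run(handwritten), run(handwritten, True))
```

A returned program is a counterexample that refutes equivalence. A search that returns `None` shows only that no divergence was found in the programs tried, not that the two sides are equivalent.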

Directed-random test-sequence generation has been used to debug pipeline and memory bugs and to uncover unexpected divergences in implementation behavior. In the RISC-V ecosystem, examples of generators include RISC-V RTG and RISCV-DV; the cited source describes RISCV-DV as an advanced RISC-V sequence generator that works well where detailed traces can be compared. [C3]

RISC-V use case

For RISC-V, RISCV-DV generates assembly programs that are converted into in-memory images for execution. Its generators cover RV32IMAFDC and RV64IMAFDC and include support for page-table interactions, privileged CSR use, and traps or interrupts. The generated programs are executed on both a golden model and a processor in development, and divergence is typically detected through trace comparison. [C4]

Relationship to formal verification

Model-based random testing complements, but does not replace, model-based formal verification. Formal approaches using RVFI tracing and tools such as JasperGold can prove equivalence between traces from a simple HDL model and a pipelined HDL implementation, but the cited source notes limitations: such tools can handle only in-order pipelines and require specialist knowledge. As a result, formal verification does not yet replace functional testing for entire processors in that context. [C5]

Limitations

The cited source identifies several drawbacks of randomly generated tests. Automatically generated counterexamples can be long and convoluted compared with hand-written tests, and the generator must ensure that useful instructions exist at the targets of randomly generated branches. [C6]
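One way a generator might meet the branch-target requirement is to constrain every branch offset so that it lands on an instruction the generator itself emitted. The sketch below assumes a hypothetical toy encoding in which a branch offset counts instruction slots, and it restricts branches to forward targets to avoid loops. Both choices are illustrative assumptions, not a description of how RISCV-DV or any other cited generator works.

```python
import random

def random_program_with_branches(rng, length, branch_prob=0.3):
    """Generate a toy program whose branches always target a generated slot."""
    program = []
    for i in range(length):
        slots_after = length - 1 - i
        if slots_after >= 1 and rng.random() < branch_prob:
            # Forward offset chosen so that i + offset is a valid slot index.
            offset = rng.randint(1, slots_after)
            program.append(("beq", rng.randrange(4), rng.randrange(4), offset))
        else:
            program.append(("addi", rng.randrange(4), rng.randrange(4),
                            rng.randrange(-8, 8)))
    return program

def branch_targets_valid(program):
    """Check that every branch lands on an instruction inside the program."""
    return all(0 <= i + ins[3] < len(program)
               for i, ins in enumerate(program) if ins[0] == "beq")

example = random_program_with_branches(random.Random(1), 12)
```

Because the final slot can never hold a branch under this scheme, every taken branch reaches a generated instruction rather than uninitialized memory.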

Automated test-case reduction can mitigate the problem of long, convoluted counterexamples. PyH2P is described as applying automated test-case reduction to randomly generated RISC-V instruction sequences, often producing sequences of fewer than five instructions in which every instruction is needed to reproduce the error. However, the same source notes shortcomings: PyH2P does not perform full trace comparison with its internal PyMTL3 model, has difficulty shrinking through branches because it must produce a valid in-memory program, and does not use community-standard interfaces proven across a range of implementations. [C7]
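The general idea of test-case reduction can be sketched as a greedy loop that removes one instruction at a time and keeps the removal whenever the failure still reproduces. This is a simple illustrative strategy, not PyH2P's algorithm. The `toy_failure` predicate is a hypothetical stand-in for re-running the model and the implementation and checking for a divergence.

```python
def shrink(sequence, still_fails):
    """Greedily drop elements while the failure still reproduces."""
    if not still_fails(sequence):
        raise ValueError("input sequence must reproduce the failure")
    current = list(sequence)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate
                changed = True
                break  # restart the scan on the shorter sequence
    return current

def toy_failure(seq):
    """Hypothetical bug: triggered when a 'mul' is later followed by a 'div'."""
    return any(op == "mul" and "div" in seq[i + 1:]
               for i, op in enumerate(seq))

long_case = ["add", "mul", "xor", "lw", "div", "sub", "mul"]
minimal = shrink(long_case, toy_failure)
print(minimal)  # ['mul', 'div']
```

The result is minimal in the sense that removing any single remaining instruction makes the failure disappear. Sequences containing branches are harder to shrink this way, because deleting an instruction can invalidate a branch target.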

Use in TestRIG

TestRIG applies this style of testing through an interactive Verification Engine. In TestRIG, the Verification Engine stimulates RISC-V implementations over RVFI-DII sockets, injects instruction sequences, and compares execution traces until it finds a divergence. A Verification Engine can drive one or more RVFI-DII-compatible implementations, either using an internal RISC-V model or comparing traces from two independent implementations. Its instruction sequences may be loaded from disk, generated randomly, or produced by interactive architecture-driven state-space exploration. [C8]
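The instruction-by-instruction style of comparison described above can be sketched as follows. This is not TestRIG code and does not model the RVFI-DII protocol: the `ToyCore` class, its `inject` method, the accumulator ISA, and the saturation bug are all hypothetical, and in-process objects stand in for implementations that would really be reached over sockets.

```python
class ToyCore:
    """Hypothetical stand-in for an implementation driven by an engine."""

    def __init__(self, name, saturate=False):
        self.name = name
        self.saturate = saturate  # injected bug: clamp at 255, no wrap
        self.acc = 0

    def reset(self):
        self.acc = 0

    def inject(self, instr):
        """Execute one injected instruction and return its trace record."""
        op, imm = instr
        if op == "add":
            total = self.acc + imm
            self.acc = min(total, 255) if self.saturate else total % 256
        elif op == "clr":
            self.acc = 0
        return (op, imm, self.acc)

def run_engine(cores, sequence):
    """Inject each instruction into every core; stop at the first divergence."""
    for core in cores:
        core.reset()
    for step, instr in enumerate(sequence):
        records = [core.inject(instr) for core in cores]
        if any(r != records[0] for r in records[1:]):
            return step, {c.name: r for c, r in zip(cores, records)}
    return None

sequence = [("add", 200), ("clr", 0), ("add", 100), ("add", 200)]
report = run_engine([ToyCore("model"), ToyCore("dut", saturate=True)], sequence)
```

Because the engine supplies each instruction directly, the sequence could equally come from a file, a random generator, or an interactive search without changing the comparison loop.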

The cited TestRIG work focuses on comparing executable formal models, software ISA simulators, and simulated hardware designs rather than completed fabricated chips. This requires instrumenting CPU designs with a Direct Instruction Injection interface for tandem verification. [C9]

CITATIONS

[1] Model-based random testing detects divergence from a model and can refute, but not prove, equivalence between a formal model and an implementation. Randomized Testing of RISC-V CPUs using Direct
[2] Generated test programs are executed on both a golden model and a processor in development, with divergence typically detected by comparing execution traces. Randomized Testing of RISC-V CPUs using Direct
[3] Directed-random test-sequence generation has been used to debug pipeline and memory bugs and uncover unexpected implementation divergences; RISC-V generators include RISC-V RTG and RISCV-DV. Randomized Testing of RISC-V CPUs using Direct
[4] RISCV-DV generates RISC-V assembly programs for execution and includes RV32IMAFDC and RV64IMAFDC generators with support for page-table interactions, privileged CSR use, and traps or interrupts. Randomized Testing of RISC-V CPUs using Direct
[5] Formal RISC-V verification using RVFI tracing and tools such as JasperGold can prove trace equivalence in some cases, but is limited to in-order pipelines and requires specialist knowledge, so it does not yet replace functional testing for entire processors. Randomized Testing of RISC-V CPUs using Direct
[6] Randomly generated tests can produce long, convoluted counterexamples and must ensure useful instructions at randomly generated branch targets. Randomized Testing of RISC-V CPUs using Direct
[7] PyH2P applies automated test-case reduction to randomly generated RISC-V instruction sequences and often produces short, meaningful reproducing sequences, but has limitations in trace comparison, branch shrinking, and interface standardization. Randomized Testing of RISC-V CPUs using Direct
[8] TestRIG's Verification Engine stimulates RISC-V implementations over RVFI-DII sockets, injects instruction sequences, and compares execution traces until divergence; sequences may be loaded from disk, generated randomly, or produced by interactive architecture-driven exploration. Randomized Testing of RISC-V CPUs using Direct
[9] The cited TestRIG work compares executable formal models, software ISA simulators, and simulated hardware designs rather than completed fabricated chips, requiring a Direct Instruction Injection interface for tandem verification. Randomized Testing of RISC-V CPUs using Direct