Randomized Testing of RISC-V CPUs Using Direct Instruction Injection

Paper WIKI v3 · 6/9/2026

A peer-reviewed paper, published in IEEE Design & Test of Computers, 41(1):40–49 in February 2024 (DOI 10.1109/MDAT.2023.3262741), that introduces TestRIG, a randomized RISC-V CPU verification ecosystem built around Direct Instruction Injection, RVFI-DII instrumentation, and QCVEngine-generated test sequences. The paper describes smart shrinking of failing instruction sequences, non-shrinkable initialization, sequence-level assertions, and a Sail-model architectural coverage comparison against riscv-tests and RISCV-DV. The work has been independently cited in follow-on research on large-scale RISC-V processor verification.

Publication

Randomized Testing of RISC-V CPUs Using Direct Instruction Injection was authored by Alexandre Joannou, Peter Rugg, Jonathan Woodruff, Franz A. Fuchs, Marno van der Maas, Matthew Naylor, Michael Roe, Robert N. M. Watson, Peter G. Neumann, and Simon W. Moore, and published in IEEE Design & Test of Computers, volume 41, issue 1, pages 40–49, in February 2024. [IEEE Design & Test of Computers] [DOI 10.1109/MDAT.2023.3262741] [1]

Overview

The paper describes a randomized verification approach for RISC-V CPU implementations built around the TestRIG ecosystem. It positions TestRIG as a standardized environment in which verification engines, models, and implementations communicate through common interfaces and can be improved independently. [TestRIG ecosystem] [2]

The central mechanism is Direct Instruction Injection: instead of relying only on fetched program binaries, the system injects instruction-level packets into implementations. The paper states that instruction injection makes shrinking of instruction sequences with branches straightforward, and that it was used to replace instruction-level unit tests for the CHERI extension. [Direct Instruction Injection] [2]

RVFI-DII interface and implementation requirements

To participate in the TestRIG ecosystem, implementations must be extended with RVFI-DII instrumentation. The paper states that supporting data structures and libraries are distributed in several languages to facilitate RVFI-DII connections over TCP ports. [RVFI-DII instrumentation] [2]

The paper also defines baseline expectations for TestRIG participants. Implementations are expected to be identical in every architecturally visible way, expose an RVFI-DII interface, provide 8 MiB of memory at address 0x80000000, and return access faults for all other addresses. They must also support reset to a known state (zeroed registers, known default CSR values, and zeroed memory) after a reset DII packet. [TestRIG requirements] [2]
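The memory-map part of these expectations can be illustrated with a minimal Python sketch. The names `ToyMemory`, `AccessFault`, `load`, `store`, and `reset` are hypothetical and are not taken from TestRIG; only the 8 MiB size, the base address 0x80000000, the fault-everywhere-else rule, and the zeroed-after-reset rule come from the text above.

```python
MEM_BASE = 0x8000_0000          # base address stated in the requirements
MEM_SIZE = 8 * 1024 * 1024      # 8 MiB


class AccessFault(Exception):
    """Hypothetical stand-in for an architectural access fault."""


class ToyMemory:
    """Toy model: one 8 MiB window, faults everywhere else."""

    def __init__(self):
        self.data = bytearray(MEM_SIZE)

    def _offset(self, addr, size):
        # The whole access must fall inside the window.
        if addr < MEM_BASE or addr + size > MEM_BASE + MEM_SIZE:
            raise AccessFault(hex(addr))
        return addr - MEM_BASE

    def load(self, addr, size):
        off = self._offset(addr, size)
        return int.from_bytes(self.data[off:off + size], "little")

    def store(self, addr, size, value):
        off = self._offset(addr, size)
        mask = (1 << (8 * size)) - 1
        self.data[off:off + size] = (value & mask).to_bytes(size, "little")

    def reset(self):
        # Memory is zeroed on reset; registers and CSRs are out of scope here.
        self.data = bytearray(MEM_SIZE)
```

A model and an implementation that both behave this way agree on which addresses fault, so a randomly generated load or store cannot produce a divergence that is merely an artifact of differing memory maps.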

The paper discusses implementation choices for instruction injection. A design may remove the instruction cache entirely while preserving architecturally visible PC translation, or it may exercise the instruction cache and replace instruction bytes after fetch. For RISC-V compressed instructions, the paper notes a choice between substituting picked instructions before decode and injecting 16-bit instruction fragments to exercise instruction-picking logic. [Injection design choices] [2]
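To make the second option concrete, the following Python sketch shows what instruction-picking logic has to do when it receives a stream of 16-bit fragments. It relies only on the standard RISC-V length encoding, in which a parcel whose two lowest bits are not `11` is a 16-bit compressed instruction. The function name `pick_instructions` is hypothetical, and encodings longer than 32 bits are ignored.

```python
def pick_instructions(parcels):
    """Group 16-bit parcels into (width, encoding) pairs.

    Assumes only 16-bit and 32-bit instructions are present.
    """
    insns = []
    i = 0
    while i < len(parcels):
        low = parcels[i]
        if (low & 0b11) != 0b11:
            # Compressed instruction: a single parcel.
            insns.append((16, low))
            i += 1
        else:
            # 32-bit instruction: low parcel first, then high parcel.
            if i + 1 >= len(parcels):
                raise ValueError("truncated 32-bit instruction")
            insns.append((32, (parcels[i + 1] << 16) | low))
            i += 2
    return insns


# c.nop (0x0001) followed by addi x1, x0, 1 (0x00100093)
print(pick_instructions([0x0001, 0x0093, 0x0010]))
```

Injecting fragments exercises this grouping step inside the implementation, whereas substituting already-picked instructions before decode bypasses it.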

Shrinking and counterexample reduction

The paper emphasizes shrinking as a key advantage of Direct Instruction Injection. Once QCVEngine finds a counterexample, QuickCheck's built-in list shrinking removes candidate instructions and reruns the test to discard instructions irrelevant to the erroneous behavior. [Smart shrinking] [2]
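The remove-and-rerun loop can be sketched in a few lines of Python. This is a simplified illustration, not QuickCheck's actual algorithm and not QCVEngine code: it drops one instruction at a time, and the names `shrink_list` and `still_fails` are hypothetical. `still_fails` stands in for rerunning the candidate sequence and checking whether the model and the implementation still diverge.

```python
def shrink_list(seq, still_fails):
    """Greedily drop elements while the failure is still reproduced."""
    current = list(seq)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                # The dropped instruction was irrelevant; keep the shorter trace.
                current = candidate
                changed = True
                break
    return current


# Toy oracle: the "bug" needs both a div and a csrw somewhere in the trace.
def toy_oracle(s):
    return "div" in s and "csrw" in s


print(shrink_list(["addi", "div", "xor", "csrw", "nop"], toy_oracle))
# → ['div', 'csrw']
```

Because instructions are injected rather than fetched, removing an element does not shift the addresses of the remaining ones, which is why sequences containing branches can be shrunk this way.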

The authors augment generic list shrinking with smart transformations. One described transformation propagates an instruction's output register into later input operands, enabling additional list-shrinking passes to remove move-like instructions. The paper also describes a simplification library that replaces distracting or esoteric instructions with simpler equivalents when possible, so that the reduced trace more directly exposes the root cause of a failure. [Smart shrinking] [2]
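One way to picture the register-propagation transformation is as a generator of candidate rewrites, each of which rewires one later source operand to an earlier instruction's destination register. The Python sketch below is an assumption about the general shape of such a transformation, not the paper's implementation; `Insn` and `propagate_candidates` are hypothetical names, and each candidate would still have to be rerun to confirm that it reproduces the failure.

```python
from typing import NamedTuple


class Insn(NamedTuple):
    """Toy three-register instruction (hypothetical representation)."""
    op: str
    rd: int
    rs1: int
    rs2: int


def propagate_candidates(seq):
    """Yield variants where a later source operand reads an earlier rd."""
    for i, producer in enumerate(seq):
        for j in range(i + 1, len(seq)):
            consumer = seq[j]
            for field in ("rs1", "rs2"):
                if getattr(consumer, field) != producer.rd:
                    variant = list(seq)
                    variant[j] = consumer._replace(**{field: producer.rd})
                    yield variant


seq = [
    Insn("add", 5, 1, 2),   # x5 = x1 + x2
    Insn("add", 6, 5, 0),   # move-like: x6 = x5 + x0
    Insn("sub", 7, 6, 3),   # x7 = x6 - x3
]
candidates = list(propagate_candidates(seq))
```

Among the candidates is one in which the `sub` reads `x5` directly. If that variant still fails, the move-like instruction no longer feeds anything, and a subsequent list-shrinking pass can remove it.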

Sequences can also be annotated as non-shrinkable. The paper gives the example of forcing initialization to avoid trivial counterexamples caused by uninitialized floating-point registers, which allows testing to proceed to more interesting divergences in exception conditions and rounding modes. [Non-shrinkable sequences] [2]
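A minimal Python sketch of the idea: each instruction carries a flag saying whether the shrinker may remove it, and pinned instructions are never candidates for removal. The pair representation and the names `shrink_respecting_pins` and `still_fails` are hypothetical and do not reflect how QCVEngine annotates sequences.

```python
def shrink_respecting_pins(seq, still_fails):
    """Greedy shrinking over (insn, shrinkable) pairs.

    Entries with shrinkable=False are always kept.
    """
    current = list(seq)
    changed = True
    while changed:
        changed = False
        for i, (_, shrinkable) in enumerate(current):
            if not shrinkable:
                continue  # pinned initialization stays in the trace
            candidate = current[:i] + current[i + 1:]
            if still_fails([insn for insn, _ in candidate]):
                current = candidate
                changed = True
                break
    return current


seq = [
    ("fmv.w.x f1, x0", False),   # pinned floating-point register initialization
    ("addi x1, x0, 1", True),
    ("fdiv.s f2, f1, f1", True),
    ("nop", True),
]
result = shrink_respecting_pins(seq, lambda s: "fdiv.s f2, f1, f1" in s)
```

With this toy oracle the result keeps the pinned initialization and the `fdiv.s`, so the reduced trace cannot degenerate into a counterexample that only shows an uninitialized register.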

Assertions

TestRIG sequences may include assertions, such as asserting that the value written by the previous instruction was non-zero. According to the paper, assertions allow failures without requiring tandem verification, and the authors used them to test limits of implementation-defined behavior. [Assertions] [2]
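The following Python sketch shows how a sequence with embedded assertions can fail on its own, without a second implementation to compare against. The item format, the `execute` callback, and the function name `run_with_assertions` are hypothetical; `execute` stands in for injecting one instruction and reading back the value it wrote.

```python
def run_with_assertions(seq, execute):
    """Run a sequence; return the index of the first failed assertion, or None.

    Items are ("insn", text) or ("assert", predicate), where the predicate
    receives the value written by the most recent instruction.
    """
    last_written = None
    for idx, (kind, payload) in enumerate(seq):
        if kind == "insn":
            last_written = execute(payload)
        elif kind == "assert":
            if not payload(last_written):
                return idx
        else:
            raise ValueError(kind)
    return None


def nonzero(value):
    return value is not None and value != 0


# Toy executor: a lookup table of write-back values.
toy_results = {"li x1, 5": 5, "sub x1, x1, x1": 0}
seq = [
    ("insn", "li x1, 5"),
    ("assert", nonzero),
    ("insn", "sub x1, x1, x1"),
    ("assert", nonzero),
]
print(run_with_assertions(seq, toy_results.get))  # → 3
```

A failing index can then be handed to the same shrinking machinery as a tandem-verification mismatch.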

Coverage evaluation

The paper evaluates architectural coverage using sailcov, which measures how many branches of the RISC-V Sail model are explored during a run. The coverage study compares TestRIG's QCVEngine against the RISC-V test suite (riscv-tests) and the RISCV-DV generator. [Coverage methodology] [3]

The study conducts two runs of each framework (QCVEngine, riscv-tests, and RISCV-DV) for both RV32IMC and RV64IMAFDCZicsr. For RV32IMC, the paper measures Sail-model coverage of the I, M, and C extension instructions and general-purpose registers. [Coverage methodology] [3]
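The kind of figure being compared can be illustrated with a short Python sketch that treats coverage as the fraction of known model branches hit by at least one run. This is an assumption about the general shape of the metric, not a description of sailcov's output format; `coverage_fraction` and the branch identifiers are hypothetical, and the numbers are made up for illustration rather than taken from the paper.

```python
def coverage_fraction(runs, all_branches):
    """Fraction of model branches hit by at least one run."""
    if not all_branches:
        raise ValueError("no branches to cover")
    hit = set().union(*runs) & set(all_branches)
    return len(hit) / len(all_branches)


# Illustrative data only: four branches in a toy model, two runs.
all_branches = {"b0", "b1", "b2", "b3"}
runs = [{"b0", "b1"}, {"b1", "b2"}]
print(coverage_fraction(runs, all_branches))  # → 0.75
```

Computing the same fraction for each framework over the same branch set gives directly comparable figures.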

Related tools and context

The paper situates TestRIG relative to other verification approaches. It describes PyH2P as pointing in an encouraging direction but lacking community-standard interfaces proven across a range of implementations; TestRIG is presented as maturing that approach through standardized communication among verification engines, models, and implementations. [Related work] [2]

The paper also mentions IBM's Genesys-Pro as a template-based approach for intelligently solving for desired deep states, and Symbolic QED as an approach that uses a formal model of the pipeline to generate minimal tests for verification, including post-silicon verification. [Related work] [2]

External reception

Follow-on work on large-scale RISC-V processor verification independently cites the paper's RVFI-DII approach. That work reports that applying RVFI-DII to the Ibex core requires more than 450 lines of code, and it uses the approach as a baseline against which alternative LLM-assisted testbench generation methods are compared. [4] [5]

Significance

The paper's main contribution is an end-to-end randomized RISC-V CPU testing framework: standardized RVFI-DII integration, direct instruction injection, randomized generation through QCVEngine, an architectural coverage comparison against other test generators, and shrinking techniques that reduce failures to simpler counterexamples. [TestRIG ecosystem] [Smart shrinking] [Coverage methodology] [2] [3]

CITATIONS

[1] The paper was published in IEEE Design & Test of Computers, volume 41, issue 1, pages 40–49, in February 2024, with DOI 10.1109/MDAT.2023.3262741. Randomized Testing of RISC-V CPUs Using Direct Instruction Injection - researchr publication
[2] TestRIG is positioned as a standardized environment in which verification engines, models, and implementations communicate through common interfaces and can be improved independently. Randomized Testing of RISC-V CPUs Using Direct Instruction Injection
[3] Direct Instruction Injection injects instruction-level packets into implementations, which makes shrinking of instruction sequences with branches straightforward and was used to replace instruction-level unit tests for the CHERI extension. Randomized Testing of RISC-V CPUs Using Direct Instruction Injection
[4] Implementations must expose an RVFI-DII interface, and supporting data structures and libraries are distributed in several languages to facilitate connections over TCP ports. Randomized Testing of RISC-V CPUs Using Direct Instruction Injection
[5] TestRIG baseline expectations include 8 MiB of memory at address 0x80000000, access faults for all other addresses, and reset to a known state with zeroed registers, known default CSR values, and zeroed memory after a reset DII packet. Randomized Testing of RISC-V CPUs Using Direct Instruction Injection
[6] Implementation choices for instruction injection include removing the instruction cache while preserving PC translation, or exercising the cache and replacing instruction bytes after fetch; for compressed instructions, either substituting picked instructions before decode or injecting 16-bit fragments. Randomized Testing of RISC-V CPUs Using Direct Instruction Injection
[7] Smart shrinking uses QuickCheck's built-in list shrinking and additional transformations such as propagating output registers into later input operands, plus a simplification library to replace esoteric instructions with simpler equivalents. Randomized Testing of RISC-V CPUs Using Direct Instruction Injection
[8] Sequences can be annotated as non-shrinkable to force initialization that avoids trivial counterexamples, e.g., from uninitialized floating-point registers, allowing testing of more interesting exception and rounding-mode behavior. Randomized Testing of RISC-V CPUs Using Direct Instruction Injection
[9] TestRIG sequences may include assertions (e.g., asserting that a previous instruction wrote a non-zero value), enabling failures without tandem verification, and were used to test limits of implementation-defined behavior. Randomized Testing of RISC-V CPUs Using Direct Instruction Injection
[10] Architectural coverage is measured using sailcov on the RISC-V Sail model, comparing QCVEngine against riscv-tests and RISCV-DV across RV32IMC and RV64IMAFDCZicsr, with RV32IMC coverage measured for the I, M, and C extension instructions and general-purpose registers. Randomized Testing of RISC-V CPUs Using Direct Instruction Injection
[11] Related work discussed in the paper includes PyH2P, IBM's Genesys-Pro, and Symbolic QED; TestRIG is presented as maturing the PyH2P approach via standardized interfaces. Randomized Testing of RISC-V CPUs Using Direct Instruction Injection
[12] Applying RVFI-DII to the Ibex core has been reported in follow-on work to require more than 450 lines of code, and the paper is cited as the Joannou2024 RVFI-DII reference in large-scale RISC-V verification research. Large-Scale RISC-V Processor Verification Using Automated ...
