STIMSMITH

Test Case Shrinking

Technique WIKI v1 · 5/30/2026

Test case shrinking is used in RISC-V CPU verification to turn generated instruction sequences into smaller counterexamples or minimal tests. In TestRIG, direct instruction injection makes shrinking sequences with branches straightforward, and shrink/no-shrink annotations can preserve deterministic setup while isolating failing behavior.

Overview

Test case shrinking is a verification technique that reduces generated instruction sequences to smaller counterexamples or minimal tests while preserving the behavior needed to demonstrate a bug. In the supplied RISC-V CPU verification evidence, shrinking appears in two closely related forms. TestRIG uses direct instruction injection, which makes shrinking instruction sequences, including sequences with branches, straightforward. Symbolic QED is described as generating minimal tests for verification, including post-silicon verification, from a formal model of the pipeline.

Use in TestRIG

TestRIG is a randomized RISC-V CPU testing ecosystem based on RVFI-DII instrumentation and direct instruction injection. The evidence states that instruction injection “allows straightforward shrinking of sequences with branches.” This is significant because branches can otherwise make instruction-stream reduction difficult: removing or changing instructions may alter control flow and invalidate the failing behavior.
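The reduction step can be sketched generically. The code below is a minimal illustration, not TestRIG's own shrinking algorithm: it assumes a test is a flat list of instructions injected in order, and that a hypothetical `still_fails` callback reruns a candidate and reports whether the failure persists. The mnemonics and the `toy_fails` predicate are invented stand-ins for a real design-versus-model comparison. Under direct injection as described above, the assumption is that deleting an entry leaves the remaining entries to be injected in the same order, which is what lets a simple chunk-removal loop work even when the sequence contains branches.

```python
from typing import Callable, List, Sequence


def shrink(seq: Sequence[str], still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily delete chunks of instructions while the failure persists."""
    current = list(seq)
    chunk = max(1, len(current) // 2)
    while chunk >= 1:
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if candidate and still_fails(candidate):
                current = candidate  # smaller and still failing: keep it, retry same index
            else:
                i += chunk           # this chunk is needed: move past it
        chunk //= 2                  # try finer-grained deletions next
    return current


def toy_fails(seq: List[str]) -> bool:
    """Hypothetical failure condition: a 'sw' followed later by a 'lw'."""
    return "sw" in seq and "lw" in seq[seq.index("sw") + 1:]


generated = ["addi", "beq", "sw", "xor", "jal", "lw", "addi"]
print(shrink(generated, toy_fails))  # ['sw', 'lw']
```

The branch-like entries (`beq`, `jal`) are removed like any other instruction here because, in this toy model, the sequence itself defines what executes. A reducer working on a program fetched from memory would instead have to account for changed branch targets.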

TestRIG examples also show shrink-control annotations in reduced counterexamples. One shrunken counterexample for a CHERI cSetBoundsImmediate vulnerability contains .noshrink, .shrink, and .assert directives. The .noshrink regions preserve setup and deterministic observation state, while the shrinkable region isolates the illegal capability-bounds operation and subsequent load that demonstrates the bug. The evidence explains that .noshrink was required to initialize performance counters so that the final assertion on the L1 cache-miss counter was deterministic.
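How such annotations could constrain a reducer can be sketched as follows. This is a simplified model under stated assumptions: the source does not give TestRIG's trace syntax, so the line-oriented format, the rule that lines are shrinkable by default, the treatment of `.assert` lines as never removable, and every instruction name below are hypothetical. Only the directive names `.noshrink`, `.shrink`, and `.assert` come from the text above.

```python
from typing import Callable, List, Tuple

Tagged = Tuple[str, bool]  # (line text, may this line be removed?)


def tag_lines(lines: List[str]) -> List[Tagged]:
    """Attach a 'removable' flag to each line based on the enclosing directive."""
    shrinkable = True  # assumption: shrinkable until a .noshrink is seen
    tagged: List[Tagged] = []
    for line in lines:
        text = line.strip()
        if text == ".noshrink":
            shrinkable = False
        elif text == ".shrink":
            shrinkable = True
        elif text.startswith(".assert"):
            tagged.append((text, False))  # assumption: assertions are always kept
        elif text:
            tagged.append((text, shrinkable))
    return tagged


def shrink_tagged(tagged: List[Tagged],
                  still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Remove one removable line at a time while the failure persists."""
    current = list(tagged)
    i = 0
    while i < len(current):
        _, removable = current[i]
        candidate = current[:i] + current[i + 1:]
        if removable and still_fails([t for t, _ in candidate]):
            current = candidate  # line was irrelevant: drop it
        else:
            i += 1               # protected or necessary: keep it
    return [t for t, _ in current]


def toy_fails(seq: List[str]) -> bool:
    """Hypothetical failure condition: 'bad_op' followed later by 'load'."""
    return "bad_op" in seq and "load" in seq[seq.index("bad_op") + 1:]


trace = [
    ".noshrink", "init_counter",
    ".shrink", "nop", "bad_op", "nop", "load",
    ".noshrink", "read_counter",
    ".assert counter == 0",
]
print(shrink_tagged(tag_lines(trace), toy_fails))
# ['init_counter', 'bad_op', 'load', 'read_counter', '.assert counter == 0']
```

The protected setup and observation lines survive even though the toy failure predicate does not depend on them, which mirrors the role described for `.noshrink` in keeping the final counter assertion deterministic.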

Example: shrinking a CHERI counterexample

The TestRIG paper gives a shrunken counterexample involving cSetBoundsImmediate. CHERI only allows capability bounds to be reduced, so the instruction is illegal when it attempts to enlarge bounds and should throw an exception. The shrunken sequence nevertheless showed that the capability the illegal instruction would have produced could still be forwarded during a pipeline flush, causing a cache fill and potentially enabling side-channel attacks.

This example illustrates why shrinking is useful in hardware verification: it turns a discovered failure into a compact, interpretable counterexample that separates necessary setup from the minimal failing behavior.
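The bounds rule behind the cSetBoundsImmediate example, that bounds may only be reduced, can be illustrated with a toy model. This is a deliberately simplified sketch: the `Capability` class, `set_bounds` function, and `BoundsViolation` exception are hypothetical names, and the model ignores tags, permissions, bounds compression, and the way the real instruction derives the new base. It shows only the architectural check that the reported bug bypassed microarchitecturally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    base: int  # lowest address the capability may access
    top: int   # one past the highest accessible address


class BoundsViolation(Exception):
    """Raised when a request would enlarge a capability's bounds."""


def set_bounds(cap: Capability, new_base: int, length: int) -> Capability:
    """Return a narrowed capability, or raise if the request is not a subset."""
    new_top = new_base + length
    if length < 0 or new_base < cap.base or new_top > cap.top:
        raise BoundsViolation(
            f"[{new_base:#x}, {new_top:#x}) is not within [{cap.base:#x}, {cap.top:#x})"
        )
    return Capability(new_base, new_top)


parent = Capability(base=0x1000, top=0x2000)
print(set_bounds(parent, 0x1800, 0x100))  # narrowing is allowed
try:
    set_bounds(parent, 0x1800, 0x1000)    # would extend past parent.top
except BoundsViolation as exc:
    print("exception:", exc)
```

In the counterexample described above, the exception was still the architecturally correct outcome. The problem was that the would-be result became visible to a later instruction before the flush took effect.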

Relationship to minimal-test generation

Symbolic QED is described as another approach that generates minimal tests for verification, including post-silicon verification, using a formal pipeline model. This places it near the same goal as test case shrinking: producing small tests or counterexamples that expose failures with less irrelevant instruction context.

Practical role

Within the supplied evidence, shrinking supports debugging and regression work by making failures easier to understand. TestRIG is reported to find architectural bugs, microarchitectural mistakes such as register-forwarding or pipeline-flush problems, memory mistakes such as cache bugs or memory speculation failures, and unexpected feature interactions. Shrinking helps convert those discoveries into concise counterexamples suitable for analysis.

CITATIONS

[1] TestRIG uses direct instruction injection, and instruction injection allows straightforward shrinking of instruction sequences with branches. Source: Randomized Testing of RISC-V CPUs using Direct
[2] A TestRIG shrunken counterexample used .noshrink, .shrink, and .assert directives to demonstrate a CHERI cSetBoundsImmediate vulnerability. Source: Randomized Testing of RISC-V CPUs using Direct
[3] .noshrink was used to preserve initialization needed for a deterministic final assertion on the L1 cache-miss counter. Source: Randomized Testing of RISC-V CPUs using Direct
[4] The cSetBoundsImmediate counterexample showed an illegal bounds-enlarging operation that threw an exception, while a forwarded capability during pipeline flush caused a cache fill that could lead to side-channel attacks. Source: Randomized Testing of RISC-V CPUs using Direct
[5] Symbolic QED generates minimal tests for verification, including post-silicon verification, using a formal pipeline model. Source: Randomized Testing of RISC-V CPUs using Direct
[6] TestRIG has been used to find architectural bugs, microarchitectural mistakes such as forwarding or pipeline-flush problems, memory mistakes such as cache bugs or memory speculation failures, and unexpected interactions between architectural features. Source: Randomized Testing of RISC-V CPUs using Direct