
Cache Bug Detection

Concept WIKI v1 · 5/30/2026

Cache bug detection is the process of exposing incorrect cache behavior in processor implementations. In the TestRIG RISC-V testing work, targeted randomized memory generators found cache-related bugs that had escaped static unit tests. In the Flute case, the work surfaced a mismatch between the implemented and specified data-cache configuration and produced a counterexample of only three memory operations on overlapping addresses.

Overview

Cache bug detection targets incorrect behavior in a processor's cache subsystem, especially memory errors that static unit-test suites are unlikely to anticipate. The TestRIG work on randomized RISC-V CPU testing describes cache bugs as a class of memory mistakes that targeted generators can discover efficiently, even though they are notoriously difficult to find with static unit tests. [C1]

TestRIG approach

In the cited case, TestRIG used a generator that constructed addresses within the TestRIG memory range and emitted random loads and stores. The generator was applied after a cache issue had escaped the existing unit-test suite. [C2]
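
A generator of this kind can be sketched in a few lines of Python. This is only an illustration, not TestRIG's actual generator: the names `MEM_BASE`, `MEM_SIZE`, `MemOp`, and `gen_mem_ops` are hypothetical, the memory range values are placeholders, and the choice of naturally aligned accesses with a 50/50 load/store split is an assumption made for the sketch.

```python
import random
from dataclasses import dataclass

# Placeholder memory range, assumed for illustration (not taken from the paper).
MEM_BASE = 0x8000_0000
MEM_SIZE = 0x1_0000  # 64 KiB


@dataclass(frozen=True)
class MemOp:
    kind: str       # "load" or "store"
    addr: int       # byte address
    width: int      # access size in bytes
    value: int = 0  # store data; unused for loads


def gen_mem_ops(rng, count):
    """Return `count` random loads and stores inside the memory range."""
    ops = []
    for _ in range(count):
        width = rng.choice([1, 2, 4, 8])
        # Pick a naturally aligned slot so the access never leaves the range.
        slot = rng.randrange(MEM_SIZE // width)
        addr = MEM_BASE + slot * width
        if rng.random() < 0.5:
            ops.append(MemOp("load", addr, width))
        else:
            ops.append(MemOp("store", addr, width, rng.getrandbits(8 * width)))
    return ops
```

Passing a seeded `random.Random` instance keeps a failing sequence reproducible. Narrowing `MEM_SIZE` would presumably raise the chance that accesses overlap, which is the pattern the Flute counterexample relied on.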

The value of the approach comes from producing small, reproducible counterexamples. In the Flute cache case, the generator found the bug after 42 tests and 20 rounds of shrinking, reducing the failure to a short instruction sequence. [C3]
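
The idea of shrinking can be illustrated with a greedy reducer that repeatedly drops one operation as long as the sequence still fails. This is a minimal sketch under stated assumptions: TestRIG's own shrinking strategy is not described here, and the names `shrink` and `fails` are hypothetical. The `fails` callback stands in for "run the sequence on the implementation and the model and report whether they diverge".

```python
def shrink(ops, fails):
    """Greedily remove single operations while `fails` stays true.

    Returns the reduced sequence and the number of successful
    shrinking rounds (one round = one operation removed).
    """
    current = list(ops)
    rounds = 0
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if fails(candidate):
                # The shorter sequence still fails: keep it and start over.
                current = candidate
                rounds += 1
                changed = True
                break
    return current, rounds
```

For example, with a stand-in predicate that fails whenever item `3` appears before item `7`, `shrink([1, 3, 5, 7, 9], fails)` removes the three irrelevant items and leaves `[3, 7]`.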

Flute cache bug example

The processor under test was Flute, described in the paper as a working in-order RV64G design. The TestRIG work revealed that its data cache was implemented as direct-mapped and 4 KiB, rather than the specified 2-way associative and 8 KiB. A parameter experiment found that the 2-way cache configuration could not boot the operating system. [C4]

The reduced counterexample contained only three memory operations: two loads with a single store between them, all targeting overlapping addresses. The result of the final load diverged. The paper reports that the counterexample was found less than 10 seconds into the TestRIG run and that the fix was completed within an hour. [C5]
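
The load, store, load shape described above can be replayed against a toy model to show why all three operations are needed. The sketch below is not Flute's actual bug: `StaleCacheMemory` is a hypothetical device with a deliberately injected fault (stores never invalidate cached load results), and `RefMemory` and `first_divergence` are illustrative names.

```python
class RefMemory:
    """Byte-addressed reference model: every store is immediately visible."""

    def __init__(self):
        self.data = {}

    def store(self, addr, width, value):
        for i in range(width):
            self.data[addr + i] = (value >> (8 * i)) & 0xFF

    def load(self, addr, width):
        # Little-endian read; unwritten bytes read as zero.
        return sum(self.data.get(addr + i, 0) << (8 * i) for i in range(width))


class StaleCacheMemory(RefMemory):
    """Toy device with an injected bug: cached loads survive later stores."""

    def __init__(self):
        super().__init__()
        self.cache = {}

    def load(self, addr, width):
        key = (addr, width)
        if key not in self.cache:
            self.cache[key] = super().load(addr, width)
        return self.cache[key]


def first_divergence(ops):
    """Return the index of the first load whose result differs, else None."""
    ref, dut = RefMemory(), StaleCacheMemory()
    for index, (kind, addr, width, value) in enumerate(ops):
        if kind == "store":
            ref.store(addr, width, value)
            dut.store(addr, width, value)
        elif ref.load(addr, width) != dut.load(addr, width):
            return index
    return None


counterexample = [
    ("load", 0x8000_0000, 8, 0),      # warms the (buggy) cache
    ("store", 0x8000_0004, 1, 0xAB),  # overlaps the loaded doubleword
    ("load", 0x8000_0000, 8, 0),      # reload sees stale data on the toy device
]
```

In this toy setup the divergence appears at the third operation, and dropping the first load makes it disappear, since nothing stale is cached. That mirrors why a shrunk counterexample of this kind cannot get shorter than three operations.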

Why reduced counterexamples matter

The Flute bug had escaped the processor's development process and the RISC-V unit-test suite. The authors state that it was overwhelmingly difficult to debug from a full software trace, but trivial to resolve once TestRIG provided the reduced counterexample. [C6]

Related cache-observable behavior

The same paper also describes TestRIG using assertions over hardware performance counters, including an L1 cache miss counter, so that cache-visible effects show up deterministically in a shrunk counterexample involving an illegal CHERI bounds operation. The paper reports that a capability value forwarded during a pipeline flush caused a cache fill that could lead to side-channel attacks. [C7]
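
The counter-based check can be sketched as "read the miss counter, run the sequence, compare the difference with what the model allows". The code below is an assumption-laden illustration rather than TestRIG's mechanism: `CountingCache` and `miss_delta` are hypothetical names, and the toy cache geometry (64 lines of 64 bytes, direct-mapped) is chosen only to keep the example small.

```python
class CountingCache:
    """Toy direct-mapped cache that tracks only tags and a miss counter."""

    def __init__(self, num_lines=64, line_bytes=64):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines
        self.misses = 0

    def access(self, addr):
        line = addr // self.line_bytes
        index = line % self.num_lines
        if self.tags[index] != line:
            self.tags[index] = line  # fill the line on a miss
            self.misses += 1


def miss_delta(cache, addrs):
    """Return how many misses the given accesses add to the counter."""
    before = cache.misses
    for addr in addrs:
        cache.access(addr)
    return cache.misses - before
```

An assertion in this style would state that an instruction which traps, and therefore should not touch memory, contributes a miss delta of zero. An implementation that performs a stray cache fill would report a delta of one, which turns an otherwise invisible timing effect into a plain value mismatch.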

Relationship to counterexample-driven development

The paper frames TestRIG's model-based testing as supporting counterexample-driven development: instead of waiting for broad software traces or hand-written unit tests, developers receive reduced stimuli that can expose both basic bugs and advanced interactions. QCVEngine is mentioned in this context as providing a tight cycle of reduced counterexamples for CHERI work on Ibex. [C8]

CITATIONS

[C1] Cache bugs are memory mistakes that TestRIG found efficiently with targeted generators, and they are hard to discover using static unit tests. Source: Randomized Testing of RISC-V CPUs using Direct
[C2] The cache-bug generator constructed addresses within the TestRIG memory range and generated random loads and stores after the bug was not found by the unit-test suite. Source: Randomized Testing of RISC-V CPUs using Direct
[C3] The Flute cache bug was discovered after 42 tests and 20 rounds of shrinking. Source: Randomized Testing of RISC-V CPUs using Direct
[C4] Flute's data cache was implemented as direct-mapped and 4 KiB rather than the specified 2-way associative and 8 KiB cache, and a parameter experiment found that the 2-way cache could not boot the operating system. Source: Randomized Testing of RISC-V CPUs using Direct
[C5] The reduced Flute counterexample had two loads with one store between them to overlapping addresses, was found less than 10 seconds into the run, and was fixed within an hour. Source: Randomized Testing of RISC-V CPUs using Direct
[C6] The Flute cache bug escaped the development process and RISC-V unit-test suite and was difficult to debug from a full software trace, but was easy to resolve with a TestRIG counterexample. Source: Randomized Testing of RISC-V CPUs using Direct
[C7] A TestRIG shrunk counterexample used an L1 cache miss counter assertion to observe a cache fill caused by forwarded capability data during a pipeline flush, which could lead to side-channel attacks. Source: Randomized Testing of RISC-V CPUs using Direct
[C8] TestRIG supports counterexample-driven development, and QCVEngine is described as providing a tight cycle of reduced counterexamples in CHERI Ibex work. Source: Randomized Testing of RISC-V CPUs using Direct