Test Vector Generation Wiki

Overview

In the cited processor-verification workflow, a coverage-guided fuzzer generates the test vectors. The fuzzer feeds each test vector into a co-simulation environment containing an instruction set simulator (ISS) and an RTL core. The co-simulation is instrumented to collect coverage. Its coverage and return code are passed back to the fuzzer as execution feedback, which in the described approach happens through shared memory.

Feedback-guided generation loop

The workflow uses execution feedback to guide subsequent generation. The fuzzer receives coverage and return-code information, collects generated test vectors, and categorizes them into two sets: vectors where both processors show equal behavior and vectors that trigger a behavior mismatch. The fuzzing run stops when a configured fuzzing timeout is reached.
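The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited implementation: the names fuzz_loop, generate, and execute are hypothetical, coverage is modeled as a set of integers rather than a shared-memory map, and a return code of 0 is assumed to mean that both processors behaved equally.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class FuzzResult:
    equal: List[bytes] = field(default_factory=list)     # both processors agree
    mismatch: List[bytes] = field(default_factory=list)  # behavior mismatch


def fuzz_loop(
    generate: Callable[[List[bytes]], bytes],
    execute: Callable[[bytes], Tuple[Set[int], int]],
    timeout_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> FuzzResult:
    """Run a feedback-guided loop until the fuzzing timeout is reached.

    `generate` builds the next vector from the current corpus and `execute`
    stands in for the co-simulation, returning (coverage, return_code).
    """
    result = FuzzResult()
    seen_coverage: Set[int] = set()
    corpus: List[bytes] = []
    deadline = clock() + timeout_s

    while clock() < deadline:
        vector = generate(corpus)
        coverage, return_code = execute(vector)

        # Feedback: keep vectors that reach coverage not seen before.
        if not coverage <= seen_coverage:
            seen_coverage |= coverage
            corpus.append(vector)

        # Assumption: return code 0 means equal behavior, anything else a mismatch.
        if return_code == 0:
            result.equal.append(vector)
        else:
            result.mismatch.append(vector)

    return result
```

Passing the clock as a parameter is a design choice of this sketch only; it lets the timeout be exercised deterministically without waiting on real time.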

Mutations for more effective vectors

The cited work describes custom mutations as a way to improve fuzzing efficiency. These mutations come in insertion and replacement variants: insertion can make a test vector longer, while replacement keeps the test vector size unchanged. The approach also improves RISC-V Control and Status Register (CSR) testing by inserting or replacing CSR instruction pairs. In that CSR pattern, the first instruction writes a CSR and the second reads the same CSR, so CSR misbehavior is propagated into a register and can be detected by the execution controller.
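A minimal sketch of the CSR pair pattern and the two mutation variants is given below. It assumes a test vector is a list of 32-bit RISC-V instruction words, which is a simplification for illustration. The function names are hypothetical, and the choice of csrrw for the write and csrrs with x0 for the read is one possible way to form such a pair, not necessarily the one used in the cited work. The bit layout follows the standard RISC-V CSR instruction encoding.

```python
from typing import List

OPCODE_SYSTEM = 0x73   # major opcode shared by all CSR instructions
FUNCT3_CSRRW = 0b001   # atomic read/write CSR
FUNCT3_CSRRS = 0b010   # atomic read and set bits in CSR


def encode_csr(funct3: int, csr: int, rs1: int, rd: int) -> int:
    """Encode a register-form CSR instruction as a 32-bit word."""
    return (
        ((csr & 0xFFF) << 20)
        | ((rs1 & 0x1F) << 15)
        | ((funct3 & 0x7) << 12)
        | ((rd & 0x1F) << 7)
        | OPCODE_SYSTEM
    )


def csr_pair(csr: int, src: int, dst: int) -> List[int]:
    """Build a write-then-read pair targeting the same CSR."""
    write = encode_csr(FUNCT3_CSRRW, csr, src, 0)  # csrrw x0, csr, src
    read = encode_csr(FUNCT3_CSRRS, csr, 0, dst)   # csrrs dst, csr, x0
    return [write, read]


def insert_pair(vector: List[int], pair: List[int], index: int) -> List[int]:
    """Insertion variant: the vector grows by len(pair) words."""
    if not 0 <= index <= len(vector):
        raise ValueError("index out of range")
    return vector[:index] + pair + vector[index:]


def replace_pair(vector: List[int], pair: List[int], index: int) -> List[int]:
    """Replacement variant: the vector length stays unchanged."""
    if index < 0 or index + len(pair) > len(vector):
        raise ValueError("pair does not fit at index")
    return vector[:index] + pair + vector[index + len(pair):]
```

For example, csr_pair(0x340, 5, 6) writes x5 into mscratch (CSR 0x340) and reads it back into x6, so a faulty CSR implementation changes a general-purpose register that a register comparison can observe.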

Post-processing generated vectors

After generation, many test vectors may reveal the same bug. To reduce manual analysis effort, the described method adds an automatic post-processing step that clusters test vectors detecting the same bug. Each cluster is represented by a single test vector that behaves like the others in the cluster. For clustering, the authors use a custom co-simulation version that logs executed instructions and their corresponding addresses. This logging version is not used during fuzzing because hard-disk writes make it slower, and it does not need the coverage instrumentation required for fuzzing. The post-processing then extracts the instruction that leads to the bug.
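The clustering step can be illustrated with a short sketch. Everything here is an assumption made for illustration: the trace format (a list of address and instruction-word tuples per test vector), the rule that the last logged instruction is the one leading to the bug, and the choice of the shortest trace as cluster representative. The source does not specify any of these details.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical log format: one (address, instruction word) entry per executed instruction.
Trace = List[Tuple[int, int]]


def faulting_instruction(trace: Trace) -> int:
    """Assumption: the last logged instruction is the one that leads to the bug."""
    if not trace:
        raise ValueError("empty trace")
    return trace[-1][1]


def cluster_by_fault(traces: Dict[str, Trace]) -> Dict[int, List[str]]:
    """Group test vector names by their extracted faulting instruction."""
    clusters: Dict[int, List[str]] = defaultdict(list)
    for name, trace in traces.items():
        clusters[faulting_instruction(trace)].append(name)
    return dict(clusters)


def representatives(
    clusters: Dict[int, List[str]], traces: Dict[str, Trace]
) -> Dict[int, str]:
    """Pick one vector per cluster; here the shortest trace, ties broken by name."""
    return {
        instr: min(names, key=lambda n: (len(traces[n]), n))
        for instr, names in clusters.items()
    }
```

With this grouping, only one vector per cluster needs manual inspection, which matches the stated goal of reducing analysis effort.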

Example evaluation setting

The cited case study evaluates fuzzing combined with co-simulation for processor verification. It uses the open-source RISC-V VexRiscv processor as the RTL device under test and an ISS extracted from the open-source RISC-V VP as the reference. The RTL core is translated to C++ using Verilator and embedded with the ISS in a common SystemC testbench. The evaluation uses AFL 2.56b as the out-of-process fuzzer baseline.