Post-processing Test Vector Clustering Wiki

Overview

Post-processing Test Vector Clustering is a technique used after a coverage-guided fuzzing loop in a cross-level processor verification workflow. In the described flow, fuzzing first generates test vectors for co-simulation of an RTL processor core and a reference instruction set simulator (ISS); post-processing then reduces the generated set of mismatch-triggering test vectors. [C1]

The purpose of the technique is to reduce manual analysis effort for verification engineers: mismatch-triggering test vectors are clustered so that vectors detecting the same bug end up in the same cluster. [C2]

Motivation

Fuzzing is described as an efficient verification methodology because it can generate many test vectors, reach high coverage, and uncover numerous bugs. However, after test generation, reported errors must be investigated carefully, and many test vectors may reveal the same bug. Clustering those vectors saves manual analysis time. [C3]

Method

The post-processing step operates on test vectors that caused mismatches during co-simulation. Each cluster is represented by a single test vector that behaves like every other test vector in that cluster, so an engineer can analyze the representative instead of every member. [C4]
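The grouping idea can be sketched in Python as follows. This is a minimal illustration, not the cited implementation: it assumes each failing test vector has already been reduced to a hashable "signature" that is equal for vectors exposing the same bug. How such a signature is derived is not specified in the source, and the names cluster_by_signature, pick_representatives, and the sample data are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def cluster_by_signature(
    failing: Iterable[Tuple[str, Hashable]],
) -> Dict[Hashable, List[str]]:
    """Group test-vector names by their (hypothetical) mismatch signature."""
    clusters: Dict[Hashable, List[str]] = {}
    for name, signature in failing:
        clusters.setdefault(signature, []).append(name)
    return clusters


def pick_representatives(
    clusters: Dict[Hashable, List[str]],
) -> Dict[Hashable, str]:
    """Use the first vector seen in each cluster as its representative."""
    return {signature: names[0] for signature, names in clusters.items()}


# Illustrative data only: vector names and signatures are made up.
failing_vectors = [
    ("tv_001", "sig_a"),
    ("tv_002", "sig_b"),
    ("tv_003", "sig_a"),
]
clusters = cluster_by_signature(failing_vectors)
representatives = pick_representatives(clusters)
```

Picking the first member as representative is an arbitrary choice made for this sketch; any member would do if all vectors in a cluster really behave alike.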

To support clustering, the co-simulation is compiled with more extensive logging instrumentation than the fuzzing configuration. The post-processing version logs all executed instructions together with their corresponding addresses, providing additional feedback needed for the clustering step. [C5]
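A trace log of this kind could look like the Python sketch below. The line format (hexadecimal address followed by hexadecimal instruction word) and the helper names write_trace and read_trace are assumptions made for illustration; the source does not describe the actual log format.

```python
import io
from typing import Iterable, List, TextIO, Tuple

# One log entry: (address, instruction word). This layout is assumed.
TraceEntry = Tuple[int, int]


def write_trace(entries: Iterable[TraceEntry], out: TextIO) -> None:
    """Write one 'address instruction' line per executed instruction."""
    for address, instruction in entries:
        out.write(f"{address:08x} {instruction:08x}\n")


def read_trace(src: TextIO) -> List[TraceEntry]:
    """Parse a log written by write_trace back into entries."""
    entries: List[TraceEntry] = []
    for line in src:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        address_text, instruction_text = line.split()
        entries.append((int(address_text, 16), int(instruction_text, 16)))
    return entries


# Round trip through an in-memory buffer; the values are arbitrary.
buffer = io.StringIO()
write_trace([(0x80000000, 0x00000013), (0x80000004, 0x02B54533)], buffer)
buffer.seek(0)
trace = read_trace(buffer)
```

In a real flow the log would be written to a file on disk rather than to an in-memory buffer.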

The logging-enabled co-simulation is not used for fuzzing itself because writing the log to hard disk makes it much slower. Conversely, the post-processing co-simulation does not need the coverage instrumentation that is essential during fuzzing. [C6]

From the log, the post-processing step extracts the instruction that leads to the bug. The cited description notes that the step distinguishes between two cases of mismatch, but the provided evidence does not include the details of those cases. [C7]
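One simple way to picture the extraction is shown below in Python. It is a sketch under a strong assumption: that the position in the trace at which the mismatch was detected is known, here passed in as the hypothetical parameter mismatch_step. The source does not state how the instruction is located, and the two undocumented mismatch cases are not modeled.

```python
from typing import List, Optional, Tuple

# One log entry: (address, instruction word). This layout is assumed.
TraceEntry = Tuple[int, int]


def instruction_at_mismatch(
    trace: List[TraceEntry], mismatch_step: int
) -> Optional[TraceEntry]:
    """Return the logged entry at the step where the mismatch was seen.

    Returns None when the step lies outside the logged trace.
    """
    if 0 <= mismatch_step < len(trace):
        return trace[mismatch_step]
    return None


# Arbitrary example trace; the values carry no special meaning.
example_trace = [(0x80000000, 0x00000013), (0x80000004, 0x02B54533)]
suspect = instruction_at_mismatch(example_trace, 1)
```

An extracted entry like this could then serve as the grouping key for clustering, although the source does not confirm that this is the criterion used.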

Role in the verification flow

The broader verification approach consists of two consecutive steps: first, a coverage-guided fuzzing loop generates test vectors; second, post-processing reduces the generated set. In the fuzzing loop, the generated test vectors serve as instruction streams for co-simulation of an RTL core under test and a reference ISS. An Execution Controller checks behavioral equality by comparing register values and flags any mismatch. [C1]
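The register-value comparison can be illustrated with a short Python sketch. It assumes both models expose their register file as an equally long sequence of integers; the function name mismatching_registers and the sample values are hypothetical and do not reflect the actual Execution Controller interface.

```python
from typing import List, Sequence


def mismatching_registers(
    rtl_regs: Sequence[int], iss_regs: Sequence[int]
) -> List[int]:
    """Return the indices of registers whose values differ."""
    if len(rtl_regs) != len(iss_regs):
        raise ValueError("register files differ in size")
    return [
        index
        for index, (rtl_value, iss_value) in enumerate(zip(rtl_regs, iss_regs))
        if rtl_value != iss_value
    ]


# Made-up register snapshots from the two models after the same step.
rtl_snapshot = [0, 5, 7, 9]
iss_snapshot = [0, 5, 8, 9]
diff = mismatching_registers(rtl_snapshot, iss_snapshot)
```

An empty result means the two models agree for that snapshot; a non-empty result corresponds to a mismatch.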

Within that flow, Post-processing Test Vector Clustering is the reduction stage applied to mismatch-triggering test vectors, using additional co-simulation logging to group vectors that expose the same underlying bug. [C2]