Overview
Micro-architectural bug detection is illustrated here in the context of processor verification at the register-transfer level (RTL), that is, at the level of the core implementation. The cited approach is cross-level processor verification based on endless randomized instruction stream generation, extended with Coverage-guided Aging to improve how evenly coverage points are exercised. The paper reports that this method achieved a more regular coverage distribution and found an intricate micro-architecture-related bug in an already heavily tested industrial processor and its accompanying test-bench infrastructure.
Verification setting
The paper describes a cross-level verification flow in which separate random instruction generators are initialized with the same cryptographic seeds, so that each produces the same endless randomized instruction stream. Instructions are first generated and executed by an instruction set simulator (ISS); the RTL processor fetches the same stream later. Because the timing and order of RTL fetches depend on implementation details, a core adapter explicitly accounts for micro-architectural behavior such as pipelining, prefetching, and fetch-buffering.
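The seed-sharing idea can be illustrated with a minimal sketch. It is not the paper's generator: the function name `instruction_stream`, the value of `SEED`, and the use of raw 32-bit words as "instructions" are assumptions made for illustration, and Python's `random.Random` stands in for whatever cryptographically seeded generator the real flow uses.

```python
import random
from itertools import islice

def instruction_stream(seed: int):
    """Endless stream of pseudo-random 32-bit words (hypothetical encoding)."""
    rng = random.Random(seed)  # stand-in for the seeded generator
    while True:
        yield rng.getrandbits(32)

SEED = 0xC0FFEE  # hypothetical shared seed

# Two independent generator instances, one per verification level.
iss_side = instruction_stream(SEED)
rtl_side = instruction_stream(SEED)

# The ISS consumes instructions one at a time, ahead of the RTL core.
iss_words = list(islice(iss_side, 8))

# The RTL side fetches later and in different burst sizes (e.g. prefetching),
# yet still sees exactly the same sequence of words.
rtl_words = list(islice(rtl_side, 3)) + list(islice(rtl_side, 5))

print(iss_words == rtl_words)
```

Because both sides derive the stream from the seed alone, neither needs to store or transmit the stream, which is what makes an endless stream practical in this kind of setup.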
Example bug detected
The reported bug involved an interaction between the pipeline and the test bench. According to the paper, entries in the pipeline's execute FIFO prevented the core from receiving further instructions, because the test-bench adapter emptied the pipeline only when a valid instruction was executed. As a result, a test case could trigger the error if the core ran too many invalid instructions in succession, specifically within the reported “Special & System : Special & System” coverage category.
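The failure mode can be made concrete with a toy model. This is only a sketch under stated assumptions, not the industrial design: the FIFO depth `FIFO_DEPTH`, the function `first_stall`, and the `drain_on_invalid` switch (a hypothetical alternative adapter behavior, not a fix described in the paper) are all invented for illustration.

```python
from collections import deque

FIFO_DEPTH = 4  # hypothetical depth of the execute FIFO

def first_stall(stream, drain_on_invalid=False):
    """Return the index at which the core can accept no more instructions,
    or None if the whole stream is accepted.

    `stream` is an iterable of booleans: True = valid instruction,
    False = invalid instruction.
    """
    fifo = deque()
    for index, valid in enumerate(stream):
        if len(fifo) >= FIFO_DEPTH:
            return index  # FIFO full: the core is blocked
        fifo.append(valid)
        # Modelled adapter behavior: empty the pipeline only on a valid
        # instruction, unless the hypothetical switch is enabled.
        if valid or drain_on_invalid:
            fifo.clear()
    return None

short_runs = [False, False, True] * 5   # invalid runs shorter than the FIFO
long_run = [True] + [False] * 6         # too many invalid instructions in a row

print(first_stall(short_runs))
print(first_stall(long_run))
print(first_stall(long_run, drain_on_invalid=True))
```

In this model the stall appears only when a run of invalid instructions is at least as long as the FIFO is deep, which is consistent with the description of a bug that surfaces only under a specific, rarely generated instruction pattern.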
Role of Coverage-guided Aging
Coverage-guided Aging is presented as an extension to cross-level processor verification. In the case study, it complemented randomized testing by helping close coverage gaps and producing a more regular coverage distribution. The authors also identify future work around more advanced micro-architecture coverage metrics, including metrics for testing features such as pipeline hazard handling.
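The text above does not spell out the aging mechanism, so the following is only one plausible reading, not the paper's algorithm: each instruction category's selection weight shrinks as its coverage point accumulates hits, so rarely exercised categories become relatively more likely. The category names, base weights, and the `base / (1 + hits)` rule are all assumptions chosen for illustration.

```python
import random

# Hypothetical categories with a deliberately skewed base distribution.
BASE_WEIGHTS = {"alu": 8.0, "load_store": 4.0, "branch": 2.0, "special_system": 1.0}

def generate(n, aging, seed=1):
    """Draw n categories and return the hit count per category."""
    rng = random.Random(seed)
    names = list(BASE_WEIGHTS)
    hits = {name: 0 for name in names}
    for _ in range(n):
        if aging:
            # Assumed aging rule: weight decays with the number of hits so far.
            weights = [BASE_WEIGHTS[k] / (1 + hits[k]) for k in names]
        else:
            weights = [BASE_WEIGHTS[k] for k in names]
        choice = rng.choices(names, weights=weights, k=1)[0]
        hits[choice] += 1
    return hits

def spread(hits):
    """Gap between the most-hit and least-hit category."""
    return max(hits.values()) - min(hits.values())

plain = generate(4000, aging=False)
aged = generate(4000, aging=True)

print("without aging:", plain, "spread", spread(plain))
print("with aging:   ", aged, "spread", spread(aged))
```

Under this assumed rule the gap between the most and least exercised categories narrows, which is the kind of "more regular coverage distribution" the case study reports, although the real method may weight and age coverage points quite differently.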
Related work context
The paper contrasts this approach with other processor test-generation methods, including model-based generators, generation guided by Bayesian networks or machine learning, fuzzing, and symbolic execution. It states that some of these approaches are not designed for RTL verification or restrict the instruction streams they generate, whereas the cited approach is tailored to cross-level, RTL-oriented verification.