Bug Detection
Overview
Bug detection in hardware verification refers to the identification of discrepancies between a high-level design specification and its low-level implementation. In the tandem simulation methodology described by Xing, Gupta, and Malik (ASPDAC 2022), bug detection is realized by comparing the architectural variables produced by an instruction-level execution model (ILEM) against those produced by an RTL-based execution model (RTEM). Any deviation between the two views is treated as a potential bug that can be localized by examining nearby instructions.
Mechanism in Tandem Simulation
Tandem simulation combines the ILEM (derived from the Instruction Set Architecture or, more generally, an Instruction-Level Abstraction) and the RTEM into a cross-level execution model (CLEM). At the end of each instruction, an AV-Check compares the instruction-level architectural variables (ILAVs) with the corresponding RTL architectural variables (RTAVs). When these disagree, the deviation is flagged as a potential bug. The AV-Check can also be invoked at chosen checkpoints (intervals of multiple instructions) to reduce per-instruction comparison overhead, and an AV-Swap operation can transfer ILAV values into the RTAVs to jump-start the RTEM after a warm-up phase.
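The checking loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: both models are reduced to hypothetical Python step functions over a dictionary of architectural variables, and the names av_check, tandem_run, and check_every are invented for this example.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, int]                 # architectural variables by name
Step = Callable[[State, int], State]   # (state, operand) -> new state


def av_check(ilav: State, rtav: State) -> List[str]:
    """Return the names of architectural variables whose values disagree."""
    return [name for name in ilav if ilav[name] != rtav.get(name)]


def tandem_run(ilem_step: Step, rtem_step: Step, program: List[int],
               init: State, check_every: int = 1) -> Optional[int]:
    """Step both models together and return the index of the instruction
    at which an AV-Check first fails, or None if every check passes."""
    ilav, rtav = dict(init), dict(init)
    for i, operand in enumerate(program):
        ilav = ilem_step(ilav, operand)
        rtav = rtem_step(rtav, operand)
        # Check at every checkpoint, and always after the last instruction.
        at_checkpoint = (i + 1) % check_every == 0 or i == len(program) - 1
        if at_checkpoint and av_check(ilav, rtav):
            return i
    return None


# Hypothetical toy models: an 8-bit accumulator and a faulty "RTL" version.
def ilem(state: State, operand: int) -> State:
    return {"acc": (state["acc"] + operand) & 0xFF}


def rtem_buggy(state: State, operand: int) -> State:
    extra = 1 if operand == 7 else 0   # injected fault, for illustration only
    return {"acc": (state["acc"] + operand + extra) & 0xFF}


program = [1, 2, 7, 3, 4, 5]
per_instruction = tandem_run(ilem, rtem_buggy, program, {"acc": 0})
per_checkpoint = tandem_run(ilem, rtem_buggy, program, {"acc": 0}, check_every=4)
```

With a check after every instruction the deviation is reported at the faulty instruction itself (index 2). With a checkpoint every four instructions it is reported one instruction later (index 3), which trades some localization precision for fewer comparisons.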
Because comparison occurs at instruction boundaries rather than at the end of a full simulation trace, the technique falls into the category of instruction-by-instruction bug detection, in contrast to run-to-the-end conformance testing where the ILEM and RTEM are only compared after the complete test has executed.
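The difference between the two strategies can be illustrated on recorded snapshots of the architectural variables. The sketch below assumes each model's trace is available as a list of per-instruction snapshots; the trace values are made up for illustration and are not results from the paper, and first_divergence and fraction_simulated are hypothetical helper names.

```python
from typing import Dict, List, Optional

Snapshot = Dict[str, int]


def first_divergence(il_trace: List[Snapshot],
                     rt_trace: List[Snapshot]) -> Optional[int]:
    """Instruction-by-instruction view: index of the first differing snapshot."""
    for i, (ilav, rtav) in enumerate(zip(il_trace, rt_trace)):
        if ilav != rtav:
            return i
    return None


def fraction_simulated(detect_index: int, total: int) -> float:
    """Share of the test that had to run before the deviation was seen."""
    return (detect_index + 1) / total


# Made-up snapshots after each of four instructions.
il_trace = [{"acc": 1}, {"acc": 3}, {"acc": 6}, {"acc": 10}]
rt_trace = [{"acc": 1}, {"acc": 4}, {"acc": 7}, {"acc": 11}]

hit = first_divergence(il_trace, rt_trace)
# Run-to-the-end view: only the final snapshots are compared.
run_to_end_mismatch = il_trace[-1] != rt_trace[-1]
```

In this toy trace both strategies see the deviation, but the instruction-by-instruction check stops after two of the four instructions, whereas the run-to-the-end comparison always pays for the whole test and gives no indication of where the deviation began.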
Bug Categories Studied
The authors evaluated bug detection on three categories of artificially inserted bugs:
- Condition bug: modifies a value or condition inside an if-then-else or case statement (the canonical example is the AES-round condition bug identified in the case-study designs).
- Data bug: changes a value used in a computation.
- Expression bug: changes a logic operator, e.g., replacing an AND/OR with an XOR.
Each bug was inserted at a randomly chosen location among the tens or hundreds of candidates available in the design, producing three buggy variants per case study.
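The three categories can be pictured as small mutations of one reference function. The functions below are hypothetical and written in Python purely for illustration; the bugs in the study were inserted into the RTL of the case-study designs, not into code like this.

```python
def reference(a: int, b: int, sel: int) -> int:
    if sel == 0:
        return (a & b) + 1
    return a | b


def condition_bug(a: int, b: int, sel: int) -> int:
    if sel == 1:                 # condition changed: sel == 0 became sel == 1
        return (a & b) + 1
    return a | b


def data_bug(a: int, b: int, sel: int) -> int:
    if sel == 0:
        return (a & b) + 2       # value changed: constant 1 became 2
    return a | b


def expression_bug(a: int, b: int, sel: int) -> int:
    if sel == 0:
        return (a ^ b) + 1       # operator changed: AND became XOR
    return a | b


inputs = (0b1100, 0b1010, 0)
expected = reference(*inputs)
observed = {f.__name__: f(*inputs)
            for f in (condition_bug, data_bug, expression_bug)}
```

For this particular input every mutant produces a value that differs from the reference, so a comparison of the outputs exposes each one. Other inputs may not trigger a given mutant at all, which is consistent with detection time depending on when the test program exercises the buggy logic.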
Bug Detection Time Improvement
When comparing tandem simulation (instruction-by-instruction AV-Check) against traditional conformance testing (run-to-end comparison) on the same buggy variants:
- Tandem simulation often detects the bug well before the test would have finished under conformance testing.
- In many cases the bug is found in less than 10% of the full test time, and in most cases in less than 40%.
- An outlier is a data bug in the FlexNLP design, where the buggy data is only used in a very late stage of the test program, delaying detection.
- The absolute simulation times for the run-to-end strategy on the studied designs range from roughly 1–15 seconds across design variants.
The authors note that the AV-Swap, which incurs a one-time overhead when jumping from the ILEM into the RTEM, has a negligible cost in practical tests of millions of instructions, provided it is not invoked too frequently.
Relationship to Other Concepts
Bug detection in this framework is a property enabled by Tandem Simulation, which is the cross-level simulation technique that performs the instruction-by-instruction ILEM/RTEM comparison. Tandem simulation additionally supports jump-starting (using AV-Swap) to skip warm-up phases and accelerate bug detection further.
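Jump-starting can be sketched in the same toy setting. The example assumes the warm-up phase is run on the instruction-level model alone and that an AV-Swap is a plain copy of the variables; in a real flow the transfer into RTL state would go through the refinement map. The names av_swap and jump_start_run and the step functions are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, int]
Step = Callable[[State, int], State]


def av_swap(ilav: State) -> State:
    """Copy instruction-level values into the RTL-side variables (simplified)."""
    return dict(ilav)


def jump_start_run(ilem_step: Step, rtem_step: Step, program: List[int],
                   init: State, warmup: int) -> Tuple[Optional[int], int]:
    """Run the warm-up on the ILEM only, AV-Swap once, then check each
    remaining instruction. Returns (failing index or None, RTEM steps run)."""
    ilav = dict(init)
    for operand in program[:warmup]:
        ilav = ilem_step(ilav, operand)       # fast instruction-level warm-up
    rtav = av_swap(ilav)                      # one-time jump into the RTEM
    rtem_steps = 0
    for i in range(warmup, len(program)):
        ilav = ilem_step(ilav, program[i])
        rtav = rtem_step(rtav, program[i])
        rtem_steps += 1
        if ilav != rtav:                      # AV-Check after this instruction
            return i, rtem_steps
    return None, rtem_steps


# Hypothetical toy models: an 8-bit accumulator and a faulty "RTL" version.
def ilem(state: State, operand: int) -> State:
    return {"acc": (state["acc"] + operand) & 0xFF}


def rtem_buggy(state: State, operand: int) -> State:
    extra = 1 if operand == 7 else 0          # injected fault, illustration only
    return {"acc": (state["acc"] + operand + extra) & 0xFF}


failing_index, rtem_steps = jump_start_run(
    ilem, rtem_buggy, [1, 2, 3, 7, 4], {"acc": 0}, warmup=2)
```

Here the deviation is still reported at the faulty instruction (index 3), but the slower RTL-side model executes only two instructions instead of four because the first two were covered by the warm-up.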
Practical Significance
Within the seven case studies (including processors such as Rocket Core and accelerators such as AES-block, AES-round, GB, FlexNLP, Pico, and Piccolo), the empirical results support two main claims about bug detection:
- The instruction-by-instruction checking detects bugs earlier than run-to-the-end methods.
- Automation of the ILEM/RTEM connection — using the ILA model and its refinement map — makes this form of bug detection practical without requiring manual synchronization or controller construction between the two models.