cycle-accurate simulation
Definition and purpose
Cycle-accurate simulation is the practice of constructing a software model of a digital system whose state advances one clock cycle at a time, in lockstep with the design under study. Such simulators are used both for early hardware/software co-design, where detailed processor or accelerator models are needed before silicon exists, and for verification, where one must reason about the exact sequence of states a circuit visits.
For processor design, detailed modeling and high-performance cycle-accurate simulators are described as essential for today's hardware and software design, and pose challenges that have been the subject of many research efforts (see Generic Pipelined Processor Modeling and High Performance Cycle-Accurate Simulator Generation, arxiv 0710.4643v1).
Cycle-accurate processor modeling
The Reduced Colored Petri Net (RCPN) approach is one technique for building such simulators. RCPN offers two advantages:
- It provides a simple, intuitive way of modeling pipelined processors, where the model mirrors the processor pipeline block diagram.
- It can generate high-performance cycle-accurate simulators, because it benefits from useful features of Colored Petri Nets while avoiding their exponential growth in complexity.
Using RCPN, the authors report that their generated cycle-accurate simulators for XScale and StrongArm processor models achieved roughly an order of magnitude (~15×) speedup over the popular SimpleScalar ARM simulator (arxiv 0710.4643v1).
Cycle-accurate simulation in High-Level Synthesis (HLS)
In FPGA HLS, a large semantic gap between the HLS design and the low-level (on-board or RTL) simulation environment often creates a barrier for non-FPGA experts, and low-level simulation can be very slow. Commercial software-based HLS simulators help bridge this gap and accelerate simulation, but they have been observed to sometimes produce incorrect results.
To solve this correctness issue while retaining the speed of a software-based simulator, the FLASH flow was proposed. FLASH extracts scheduling information from the HLS tool and automatically constructs an equivalent cycle-accurate simulation model while preserving C semantics. Experimentally, FLASH runs three orders of magnitude faster than RTL simulation (Rapid Cycle-Accurate Simulator for High-Level Synthesis, arxiv 1812.07012v2).
Cycle-accurate reasoning in hardware verification
Cycle-accurate simulation also underlies certain styles of hardware verification. SMT-based tools such as SymbiYosys can verify properties that hold after a small number of cycles by unrolling the transition relation. However, expressing the circuit transition as a relation means that, when the circuit state is primarily symbolic, each unrolled step grows the symbolic expressions, leading to a large final solver query; empirically, SMT solvers exhibit poor performance when reasoning about long chains of unrolled transitions [da13564e-2b08-4a7e-b7f8-9409cf1c1d13, ab47b904-296b-461d-8987-9bce0df2787b].
The rtlv tool takes a different approach. Instead of encoding the transition relation directly into an SMT query, rtlv transforms the transition relation into an imperative step function and symbolically executes the circuit cycle by cycle, leveraging Rosette's hybrid symbolic execution with type-driven state merging and on-the-fly rewrite rules. This makes it possible to reason in a cycle-accurate manner about software running on the hardware [1].
A concrete illustration: verifying that boot code clears all microarchitectural state in a PicoRV32 CPU requires modeling 104 cycles of execution from an initially unconstrained circuit state. Using SymbiYosys, the resulting unrolled solver query did not finish within 12 hours, while rtlv completes the same task in about 1.3 seconds [1].
See also
- RCPN (Reduced Colored Petri Net), a modeling formalism used to generate high-performance cycle-accurate processor simulators.
- FLASH, an HLS simulation flow that synthesizes a cycle-accurate model from HLS scheduling information.
- rtlv, a Verilog-to-Rosette compiler that performs cycle-accurate symbolic execution of circuits for verification.
- SimpleScalar, a widely used cycle-accurate processor simulator baseline.
- SymbiYosys, a hardware verification tool that uses SMT-based BMC over unrolled transition relations.
- Rosette, a solver-aided programming language embedded in Racket used by rtlv for symbolic execution.
- PicoRV32, a small RISC-V CPU used as a verification target in the rtlv paper.