Overview
Generating an Efficient Instruction Set Simulator from a Complete Property Suite presents an approach for generating a C++ instruction set simulator (ISS) from a gap-free, complete property suite formulated in an architectural style. The paper argues that, after successful formal verification, such a property suite forms an architectural model of the verified design; therefore, the generated ISS is equivalent to the design by construction. [C1]
The method is intended to use the formal property suite as a single source of specification for both verification and instruction-set simulation. The authors state that this supports correct early software development based on instruction set simulation and allows the simulator to be automatically adapted when the design or specification changes late in the design process. [C2]
Architectural-style property suite
The paper describes a reformulation of operation properties into an architectural style suitable for ISS generation. This style includes: explicit modeling of the architectural state and memory/port interfaces; a next_state macro for instruction and interrupt effects; interface macros for memories and ports; and an explicit architectural reset state. [C3]
The architectural state is modeled using a user-defined VHDL record data type that combines the state elements of the processor, such as the register file, status flags, and program counter. Similar user-defined data types describe interfaces to memories and ports. [C4]
A central modeling construct is the update keyword, which explicitly defines write access to an array or record data structure. In the example described by the paper, next_state updates the register file when the decoded opcode corresponds to an ADD instruction, and the decoded instruction fields are stored in an instruction_t record. [C5]
Relationship to formal verification
The paper describes an equivalence-proof structure in which an abstraction function vstate maps implementation state to ISA state. The proof objective is that applying next_state at the ISA level corresponds to applying the CPU transition relation and then mapping the resulting implementation state back to the architectural state. The proof also has to establish that interface signals behave as described by interface macros and that the implementation reset state complies with the architectural reset state. [C6]
However, the paper notes that a full equivalence proof between ISA and RTL, as well as identification of mapping functions between architectural and implementation state, is not required before generating the ISS. The ISA can be developed in ITL early in the design process and used to generate an ISS; full confidence that the ISS complies with the design is obtained at the end of verification. [C7]
ISS generation
The reformulated architectural-style property is described as a single property that captures all behavior of the verified design through the next_state function. The paper characterizes this artifact as a formally checkable ISA description and uses it as the starting point for automatic generation of a C++ ISS. [C8]
The generated simulator core is a C++ class named Sim, which contains the code for instruction execution and holds the architectural state. The translation from the ITL description to C++ includes the following steps: generating public functions for next_state, decode, and interface macros; generating private functions for remaining macros; generating a member variable for the architectural state; replacing ITL/HDL data types and operations with C++ types and operations; and replacing update expressions with direct C++ array or struct overwrites. [C9]
A user-provided wrapper is still required around the generated simulator class. This wrapper calls the generated public functions to trigger single-instruction execution and connects the simulator core to peripheral components such as external memories or buses. The paper also states that the generated core can be integrated with commercial simulation and debugging tools. [C10]
Code-generation optimizations
To improve simulation performance, the generated C++ code uses native data types such as unsigned integers when ITL/HDL bit sizes fit within the host system word width. Larger bit vectors are represented with dedicated data structures. Basic operations such as integer arithmetic and simple logic are mapped to native C++ operations, while more complex operations such as bit slicing, bit rotation, and operations on large bit vectors use an optimized support library. [C11]
The generator also analyzes data dependencies in macro functions to identify shared expressions. It can insert additional simulator-class member variables to store temporary and intermediate results, thereby caching computations during simulation. [C12]
A further optimization is decode-result caching: the decode macro decomposes an instruction word into bit fields, stores them in a record data type, and caches the result so repeated instruction decoding can be avoided. The paper relates this technique to just-in-time compiled simulation and notes that software locality, such as loops, makes it effective for reducing simulation runtime. [C13]
Experimental results
Small pipelined processor
The first evaluation generated an ISS for a small pipelined processor. The CPU has eight 16-bit registers, a special interrupt-return-vector register, a five-stage pipeline, and a simple data-memory interface. Its instruction set contains seven instructions covering logic, arithmetic, memory access, and jump instructions. [C14]
For this processor, the paper compares three simulator implementations. An interpretive commercial-tool simulator achieved 0.22 MIPS, a just-in-time compiled simulator achieved 14.0 MIPS, and the ISS generated from the property suite achieved 7.0 MIPS. The paper states that this clearly outperforms interpretive simulation and reaches about 50% of the performance of a state-of-the-art JIT-CS simulation tool. [C15]
Industrial processor design
The second experiment generated an ISS for an industrial processor design. The CPU has 64 32-bit registers in multiple hardware contexts, a seven-stage pipeline, connections to data memory and a bus interface, and support for 88 instructions based on the DLX instruction set architecture. The processor core is about 10,000 lines of VHDL, and the final reformulated property suite is 2,000 lines of ITL. [C16]
The paper states that the property suite and its completeness were successfully checked against the processor design using OneSpin Verification Solutions, so it could be considered a correct and complete specification. [C17]
For the industrial design, a just-in-time compiled simulator built with the commercial tool suite averaged 2.5 MIPS, while the ISS generated by the paper's approach reached 1.2 MIPS. The authors state that this confirms that the generated ISS has performance comparable to modern custom-made instruction set simulators. [C18]
| Design | Interpretive | JIT-CS | Generated |
|---|---|---|---|
| P1 | 0.22 MIPS | 14.0 MIPS | 7.0 MIPS |
| P2 | - | 2.5 MIPS | 1.2 MIPS |
The paper attributes the remaining performance gap to technical issues, noting that commercial JIT-CS tools include many optimizations, while the presented generator is a proof of concept. It also notes that properties generated from hardware reflect hardware and pipeline effects that may not appear in a high-level ISA description and may reduce simulation performance. [C19]
Conclusion
The paper concludes that a complete architectural-style property suite can be used to generate a C++ ISS whose performance is comparable to state-of-the-art commercial tools after code-generation optimizations. It presents the method as a practical way to obtain a proven-correct and efficient instruction set simulator when formal verification is already used in industrial processor design. [C20]
Funding acknowledgment
The research was supported by the German Federal Ministry of Education and Research (BMBF) in the HERKULES project under contract number 01M3082. [C21]