Simulation Performance (MIPS) Wiki

Definition

In the context of the evaluated instruction-set simulators (ISS), MIPS is a throughput metric meaning million instructions per second; it does not refer to the MIPS processor architecture. The paper reports average ISS execution rates in MIPS when comparing interpretive simulation, commercial just-in-time compiled simulation (JIT-CS), and ISS code generated from a complete formal property suite. [C1]
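
As a minimal sketch of how such a figure can be obtained, the following C++ helper converts an instruction count and an elapsed wall-clock time into MIPS. The names `mips` and `measure_mips` and the `step` callable are hypothetical and used only for illustration; the source does not describe how the paper's measurements were taken.

```cpp
#include <chrono>
#include <cstdint>

// Converts a count of simulated instructions and a wall-clock duration
// into MIPS (million instructions per second).
// Returns 0.0 for a non-positive duration to avoid dividing by zero.
double mips(std::uint64_t instructions, std::chrono::duration<double> elapsed) {
    const double seconds = elapsed.count();
    if (seconds <= 0.0) return 0.0;
    return static_cast<double>(instructions) / seconds / 1e6;
}

// Hypothetical measurement loop: `step` is assumed to simulate exactly one
// instruction per call. Times `count` calls with a monotonic clock.
template <typename Step>
double measure_mips(Step&& step, std::uint64_t count) {
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < count; ++i) step();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return mips(count, elapsed);
}
```

For example, 7,000,000 simulated instructions in one second of wall-clock time correspond to 7.0 MIPS.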

Reported results

The evaluation in "Generating an Efficient Instruction Set Simulator from a Complete Property Suite" reports two processor experiments:

Design	Interpretive ISS	Commercial JIT-CS ISS	Generated ISS
P1: small pipelined processor	0.22 MIPS	14.0 MIPS	7.0 MIPS
P2: industrial processor design	—	2.5 MIPS	1.2 MIPS

For the small pipelined processor (P1), the generated ISS ran at 7.0 MIPS, roughly 32 times the rate of the interpretive simulator (0.22 MIPS), and reached about 50% of the performance of the commercial JIT-CS simulator (14.0 MIPS). [C2]

For the industrial processor design (P2), the commercial JIT-CS simulator averaged 2.5 MIPS, while the generated ISS reached 1.2 MIPS. The paper states that these results confirm that the generated ISS performance is comparable to modern custom-made instruction-set simulators. [C3]
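
The relative figures follow directly from the table. The snippet below is only a worked calculation using the reported MIPS values; the names `ratio`, `p1_vs_jit`, `p1_vs_interpretive`, and `p2_vs_jit` are illustrative and do not come from the paper.

```cpp
// Ratio of two throughput figures given in the same unit (here MIPS).
constexpr double ratio(double numerator, double denominator) {
    return numerator / denominator;
}

// P1: generated ISS (7.0 MIPS) relative to the commercial JIT-CS ISS (14.0 MIPS).
constexpr double p1_vs_jit = ratio(7.0, 14.0);           // 0.5, i.e. about 50%

// P1: generated ISS (7.0 MIPS) relative to the interpretive ISS (0.22 MIPS).
constexpr double p1_vs_interpretive = ratio(7.0, 0.22);  // about 31.8

// P2: generated ISS (1.2 MIPS) relative to the commercial JIT-CS ISS (2.5 MIPS).
constexpr double p2_vs_jit = ratio(1.2, 2.5);            // about 0.48
```

In both experiments the generated ISS therefore runs at roughly half the speed of the commercial JIT-CS simulator.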

Evaluation context

The P1 experiment used a small processor with eight 16-bit registers, a special interrupt-return-vector register, a 5-stage pipeline, a simple data-memory interface, and an instruction set of seven instructions covering logic, arithmetic, memory access, and jumps. [C4]

The P2 experiment used an industrial processor design with 64 registers of 32 bits each across multiple hardware contexts, advanced processor features, a 7-stage pipeline, data-memory and bus interfaces, and 88 instructions based on the DLX instruction set architecture. Its processor core comprised about 10,000 lines of VHDL, and the final reformulated property suite comprised about 2,000 lines of ITL. [C5]

Factors affecting performance

The generated C++ simulator applied several optimizations intended to make performance comparable to custom state-of-the-art simulators. These included mapping sufficiently small ITL/HDL bit-vector data types to native C++ types, mapping basic arithmetic and logic operations to native C++ operations, using optimized library functions for more complex operations or large bit vectors, caching intermediate results based on data-dependency analysis, and caching instruction-decode results to avoid repeated decoding during simulation. [C6]
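
Two of these optimizations, native type mapping and decode caching, can be sketched in a few lines of C++. This is an illustrative sketch and not the paper's generated code: the instruction encoding, the `Decoded` layout, and the names `Word`, `add16`, `decode`, and `DecodeCache` are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Native type mapping: a 16-bit bit vector fits into a native C++ type,
// so operations on it compile to ordinary machine instructions.
using Word = std::uint16_t;

// 16-bit addition with wrap-around. The operands are promoted to int,
// and the cast truncates the sum back to 16 bits.
inline Word add16(Word a, Word b) {
    return static_cast<Word>(a + b);
}

// Hypothetical encoding of a toy instruction word (assumed for this sketch):
// bits [15:12] opcode, [11:9] rd, [8:6] rs1, [5:3] rs2.
struct Decoded {
    std::uint8_t opcode;
    std::uint8_t rd;
    std::uint8_t rs1;
    std::uint8_t rs2;
};

inline Decoded decode(Word raw) {
    return Decoded{
        static_cast<std::uint8_t>((raw >> 12) & 0xF),
        static_cast<std::uint8_t>((raw >> 9) & 0x7),
        static_cast<std::uint8_t>((raw >> 6) & 0x7),
        static_cast<std::uint8_t>((raw >> 3) & 0x7)};
}

// Decode cache keyed by program counter: each address is decoded on its
// first visit, and later visits reuse the stored result.
class DecodeCache {
public:
    const Decoded& lookup(Word pc, Word raw) {
        auto it = cache_.find(pc);
        if (it == cache_.end()) {
            ++misses_;  // first visit: decode once and remember the result
            it = cache_.emplace(pc, decode(raw)).first;
        }
        return it->second;
    }

    std::size_t misses() const { return misses_; }

private:
    std::unordered_map<Word, Decoded> cache_;
    std::size_t misses_ = 0;
};
```

In a loop-heavy program most program-counter values are visited repeatedly, so the decode cost is paid once per address instead of once per executed instruction. This sketch assumes the program memory does not change during simulation; self-modifying code would require invalidating the affected cache entries.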

The authors attribute the remaining performance gap to commercial JIT-CS tools having many engineering optimizations, while their generated ISS approach was presented as a proof of concept. They also note that the generated properties reflect hardware and pipeline effects that may be absent from higher-level ISA descriptions and can reduce simulation speed. [C7]

Significance

The paper concludes that, after formal verification, a complete architectural-style property suite can serve as an architectural model for generating a C++ ISS. With code-generation optimizations, the resulting simulator performance is described as comparable to state-of-the-art commercial tools, while preserving the benefit that the ISS is equivalent to the verified design by construction. [C8]