Golden Model Wiki — STIMSMITH

Definition

A golden model (also called a reference model) is a high-level software model of a processor or design used as the expected-behavior baseline during verification. It is characterized as fast and uncomplicated, omitting implementation details such as pipeline depth, buffer sizes, or branch prediction, and updating architectural state at instruction-level granularity rather than cycle-by-cycle (Kabylkas et al., MICRO-54, 2021; survey on hardware fuzzing, Boston University).

In the RTL-fuzzing literature, the golden model is explicitly described as an ISA-level simulation result used for differential comparison: "DIFUZZRTL keeps comparing an execution result of an RTL design with that of a golden model (i.e., an ISA-level simulation result), thus detecting the bugs at ISA level." (Hur et al., DIFUZZRTL)

Role in co-simulation

In microprocessor verification, the common practice is to build a co-simulation infrastructure that compares the design-under-test (DUT) execution against the high-level software model of the design, also known as the "golden" model. The underlying idea is simple: when the same code is run on the DUT and on the model, the architectural state of both must be the same at any given moment. When a mismatch is detected, reasons are investigated and bugs are uncovered.

Golden-model comparison addresses the fact that randomly generated verification tests are not naturally self-checking: instead of requiring each random test to know its own expected result, the infrastructure compares the DUT execution against the reference model to obtain automatic pass/fail behavior.
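
As a minimal sketch of this idea, the following toy setup generates random operations and derives pass/fail purely from comparison with a reference function. Everything in it is hypothetical: `reference_alu` stands in for a golden model, and `dut_alu` stands in for a DUT with a deliberately injected bug. It is not taken from any cited infrastructure.

```python
import random

MASK32 = 0xFFFFFFFF
OPS = ("add", "sub", "and")

def reference_alu(op, a, b):
    """Golden model stand-in: straightforward 32-bit arithmetic."""
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":
        return (a - b) & MASK32
    if op == "and":
        return a & b
    raise ValueError(f"unknown op: {op}")

def dut_alu(op, a, b):
    """Hypothetical DUT stand-in with an injected bug: 'sub' loses bit 31."""
    result = reference_alu(op, a, b)
    if op == "sub":
        result &= 0x7FFFFFFF
    return result

def check(op, a, b):
    """Pass/fail comes from the comparison, not from the test itself."""
    return dut_alu(op, a, b) == reference_alu(op, a, b)

def run_random_tests(count, seed=0):
    """Generate random stimuli and collect the ones where DUT and model differ."""
    rng = random.Random(seed)
    failures = []
    for _ in range(count):
        op = rng.choice(OPS)
        a, b = rng.getrandbits(32), rng.getrandbits(32)
        if not check(op, a, b):
            failures.append((op, a, b))
    return failures
```

The random generator never needs to know the expected result of a test; any failing stimulus is reported only because the two implementations disagree.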

Comparison strategies

A simple strategy is end-of-simulation comparison. The same code is run on both the reference model and the RTL implementation, then at the end of the simulation the register-file states and memory are dumped and compared; if any values do not match, the reasons are investigated. This approach is inexpensive, but has two drawbacks:

A buggy behavior reflected in the architectural state can be overwritten and hidden by later correct execution.
Even when a mismatch is detected, debugging may start far from the original point of divergence.
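
The first drawback can be illustrated with a toy machine. The instruction format, the four-register file, and the injected off-by-one bug in `add` are all assumptions made for this sketch; they do not correspond to any real core.

```python
def run(program, buggy=False):
    """Execute a toy program and return the final register file.

    Each instruction is (op, rd, a, b):
      ("li",  rd, imm, _)     loads an immediate into rd
      ("add", rd, rs1, rs2)   writes rs1 + rs2 into rd
    With buggy=True, 'add' is off by one (hypothetical injected bug).
    """
    regs = [0] * 4
    for op, rd, a, b in program:
        if op == "li":
            regs[rd] = a
        elif op == "add":
            regs[rd] = regs[a] + regs[b] + (1 if buggy else 0)
        else:
            raise ValueError(f"unknown op: {op}")
    return regs

def end_of_sim_mismatches(program):
    """Compare only the final state of the reference run and the buggy run."""
    ref = run(program)
    dut = run(program, buggy=True)
    return [(i, r, d) for i, (r, d) in enumerate(zip(ref, dut)) if r != d]

# The buggy 'add' result is still in register 3 at the end: detected.
visible = [("li", 1, 5, 0), ("li", 2, 7, 0), ("add", 3, 1, 2)]

# A later, correct instruction overwrites register 3: the bug is hidden.
masked = visible + [("li", 3, 0, 0)]
```

Here `end_of_sim_mismatches(visible)` reports register 3, while `end_of_sim_mismatches(masked)` reports nothing even though the same faulty `add` executed in both runs.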

A more immediate comparison, for example one performed as each instruction commits, can halt execution close to the divergence and report the stimulus that caused it. This simplifies debugging because the engineer starts the investigation at the point closest to the divergence. To support asynchronous interrupts, the setup must also support messaging that overrides the emulator's execution path: when the RTL flags an interrupt, it must inform the emulator so that the model follows the same interrupt-driven execution path.
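
A stepwise comparison with interrupt forwarding might look like the sketch below. The `ToyModel` class, its `take_interrupt` method, the trap vector address, and the commit tuple layout are hypothetical and chosen only for illustration; real emulators expose their own co-simulation interfaces.

```python
class ToyModel:
    """Hypothetical instruction-level model: every instruction loads an immediate."""
    TRAP_VECTOR = 0x100

    def __init__(self, program):
        self.program = program  # maps pc -> (rd, immediate)
        self.pc = 0
        self.regs = {}

    def take_interrupt(self):
        """Override the execution path, as requested by the DUT."""
        self.pc = self.TRAP_VECTOR

    def step(self):
        """Execute one instruction and return (pc, rd, value)."""
        pc = self.pc
        rd, value = self.program[pc]
        self.regs[rd] = value
        self.pc = pc + 4
        return pc, rd, value

def cosim(program, dut_commits):
    """Compare commit by commit; return the first divergence or None.

    Each DUT commit is (irq, pc, rd, value). irq=True means the DUT took an
    asynchronous interrupt just before committing this instruction.
    """
    model = ToyModel(program)
    for index, (irq, pc, rd, value) in enumerate(dut_commits):
        if irq:
            model.take_interrupt()
        expected = model.step()
        if expected != (pc, rd, value):
            return index, expected, (pc, rd, value)
    return None

PROGRAM = {0x0: (1, 10), 0x4: (2, 20), 0x8: (3, 30), 0x100: (5, 99)}

good = [(False, 0x0, 1, 10), (False, 0x4, 2, 20), (True, 0x100, 5, 99)]
bad = [(False, 0x0, 1, 10), (False, 0x4, 2, 21)]
```

In this sketch `cosim(PROGRAM, good)` finds no divergence because the interrupt is forwarded to the model, while `cosim(PROGRAM, bad)` stops at commit index 1, the first instruction whose result differs. If the interrupt flag in `good` were dropped, the model would continue at 0x8 and a false mismatch would be reported.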

Use as an ISA simulator (oracle)

In hardware-fuzzing literature, the golden model is concretely an ISA simulator — a software model of the hardware that does not require any low-level microarchitectural details. For a given program, it computes the values of architectural registers and memory state after the execution of each instruction. By contrast, the RTL simulator is cycle-accurate and realizes the effect of executed instructions at the microarchitectural level. The hardware fuzzer extracts an execution trace log from both the ISA simulator and the RTL simulator for the same input and cross-checks the traces; any mismatch is treated as a potential bug in the processor and marked for further investigation by the verification engineer.
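
The cross-check can be sketched as follows. The trace layouts are assumptions made for this example: the ISA-level trace holds one `(pc, rd, value)` entry per instruction, and the RTL trace holds one dictionary per cycle with a hypothetical `commit` flag. Actual simulators and fuzzers define their own log formats.

```python
from itertools import zip_longest

def retired(rtl_trace):
    """Reduce a cycle-by-cycle RTL log to instruction-level entries."""
    return [(e["pc"], e["rd"], e["value"]) for e in rtl_trace if e["commit"]]

def cross_check(isa_trace, rtl_trace):
    """Return (index, isa_entry, rtl_entry) for every position that differs.

    A missing entry on either side shows up as None, so traces of different
    length are also reported.
    """
    pairs = zip_longest(isa_trace, retired(rtl_trace))
    return [(n, isa, rtl) for n, (isa, rtl) in enumerate(pairs) if isa != rtl]

# Hand-written illustrative traces, not output from a real tool.
isa_trace = [(0x80000000, 5, 1), (0x80000004, 6, 2)]

rtl_trace = [
    {"cycle": 0, "commit": False, "pc": None, "rd": None, "value": None},
    {"cycle": 1, "commit": True, "pc": 0x80000000, "rd": 5, "value": 1},
    {"cycle": 2, "commit": False, "pc": None, "rd": None, "value": None},
    {"cycle": 3, "commit": True, "pc": 0x80000004, "rd": 6, "value": 3},
]
```

Filtering to committed instructions first is what makes the cycle-accurate log comparable with the instruction-level one; in this example the second retired instruction is flagged because the written value differs.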

Differential testing with a golden model

Golden-model-based checking is the canonical instance of differential testing applied to hardware. As described in the DIFUZZRTL paper, "such differential testing techniques are also used for RTL verification as well, particularly comparing one RTL's execution results with a golden model's execution results, which inspired the design of DIFUZZRTL." (Hur et al.) The fuzzer combines two techniques: coverage-guided fuzzing, a dynamic-testing approach that explores hardware logic in the RTL design, and differential comparison against a golden model, which clearly identifies an RTL vulnerability at the ISA level. DIFUZZRTL is reported to have found the first and only CVE vulnerabilities in any RISC-V cores, including a BOOM bug described as analogous to the Intel Pentium FDIV defect.
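
The combination of the two techniques can be sketched with a toy example. This is not DIFUZZRTL's algorithm or its coverage metric: the `dut` function, its nested-check coverage points, the single-bit mutation, and the injected bug at input 0xFFFF are all assumptions chosen to keep the example small.

```python
import random

def golden(x):
    """ISA-level reference stand-in: increment a 16-bit value with wraparound."""
    return (x + 1) & 0xFFFF

def dut(x, coverage):
    """Hypothetical DUT stand-in that records how many nested checks it passed."""
    depth = 0
    while depth < 16 and x & (1 << depth):
        depth += 1
        coverage.add(depth)  # one coverage point per nesting level reached
    if depth == 16:
        return 1  # injected bug: 0xFFFF should wrap to 0
    return (x + 1) & 0xFFFF

def fuzz(iterations, seed=0):
    """Mutate corpus inputs, keep those adding coverage, compare with golden."""
    rng = random.Random(seed)
    corpus = [0]
    seen = set()
    findings = []
    for _ in range(iterations):
        x = rng.choice(corpus) ^ (1 << rng.randrange(16))  # flip one bit
        coverage = set()
        result = dut(x, coverage)
        expected = golden(x)
        if result != expected:
            findings.append((x, expected, result))
        if not coverage <= seen:  # new coverage: keep the input as a seed
            seen |= coverage
            corpus.append(x)
    return findings, seen
```

Coverage feedback steers mutation toward deeper logic, while the comparison against `golden` is what turns a reached state into a reported bug. Purely random 16-bit inputs would hit the faulty input with probability 1 in 65536 per attempt, whereas the corpus here advances one nesting level at a time.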

RISC-V case: Spike in the ElectraIC Advanced Verification Suite (EAVS)

In RISC-V verification, Spike is a commonly used golden model. Spike is the RISC-V ISA simulator officially released by RISC-V International and is capable of simulating one or more RISC-V harts. The ElectraIC Advanced Verification Suite (EAVS) is an example verification environment for any RISC-V core, consisting of an Instruction Set Simulator (ISS), YAML configuration files, and a RISC-V Core UVM Testbench. In EAVS:

The ISS operates as the core's reference model, acting as a golden model to determine whether the core's executed instructions are correct.
Spike's complex log format is converted into a .csv file for downstream comparison.
Spike currently runs externally rather than being integrated into the UVM environment.
The memory map is configured to avoid conflicts with Spike's embedded memory map.
The suite was demonstrated by verifying the cv32e40p core.
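
A log-to-CSV conversion step of this kind could be sketched as below. This is not the EAVS converter. It assumes instruction trace lines shaped like `core   0: 0x<pc> (0x<encoding>) <disassembly>`; the exact Spike output depends on the version and the logging options used, and the column names are hypothetical.

```python
import csv
import io
import re

# Assumed line shape: "core   0: 0x<pc> (0x<encoding>) <disassembly>"
LINE_RE = re.compile(
    r"core\s+(\d+):\s+(0x[0-9a-fA-F]+)\s+\((0x[0-9a-fA-F]+)\)\s+(.*)"
)

def spike_log_to_csv(log_text):
    """Convert matching trace lines to CSV text; other lines are skipped."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["hart", "pc", "encoding", "disassembly"])
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        hart, pc, encoding, disasm = match.groups()
        # Collapse the column padding inside the disassembly text.
        writer.writerow([hart, pc, encoding, " ".join(disasm.split())])
    return out.getvalue()

# Hand-written sample in the assumed format, not captured tool output.
SAMPLE = """\
core   0: 0x0000000080000000 (0x00000297) auipc   t0, 0x0
this line does not match the assumed format and is skipped
core   0: 0x0000000080000004 (0x02028593) addi    a1, t0, 32
"""
```

Using the `csv` module rather than joining fields by hand keeps the disassembly column intact, since operands such as `t0, 0x0` contain commas and need quoting.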

Use as a bug-discovery oracle in RTL hardware fuzzing

A golden model is the dominant bug-discovery mechanism in several RTL hardware fuzzers. TheHuzz (the RISC-V cores Rocket Chip and Ariane and the OpenRISC cores mor1kx and or1200), DIFUZZRTL (BOOM, mor1kx, Rocket Chip), Trippel et al.'s work on OpenTitan IP cores (AES, HMAC, KMAC, Timer), Logic Fuzzer (BlackParrot, BOOM, CVA6), and ProcessorFuzz (BOOM, BlackParrot, Rocket Chip) all use a golden model to cross-check RTL behavior. Other fuzzers rely on alternative mechanisms such as assertions, property checks, or coverage-based bug-finding.

Dromajo as a co-simulation golden model

Dromajo is an RV64GC emulator designed specifically for co-simulation purposes. It can boot Linux, handle external stimuli such as interrupts and debug requests on the fly, and integrate into existing testbench infrastructure with minimal effort. In the cited MICRO-54 work, integrating Dromajo into verification of three RISC-V cores (CVA6, BlackParrot, BOOM) found nine bugs, and enhancing it with the Logic Fuzzer increased the exposed bug count to thirteen without creating additional verification tests.

Limits and golden-free alternatives

Golden models are not always available or desirable, and building them can be costly. In hardware-Trojan detection, many existing detection methods rely on golden models and detailed circuit specifications, and are often specific to certain Trojan payload types, making pre-silicon verification difficult and creating security gaps. Newer "golden-free" approaches aim to avoid this requirement:

A formal property-checking method for Trojan detection in non-interfering accelerators at the RTL reports exhaustive detection of any sequential hardware Trojan independently of its payload behavior (including physical side channels), without requiring a golden model or functional specification (arXiv:2312.06515).
A run-time Trojan-detection approach using a Programmable Sensor Array (PSA) — a tampering-resilient on-chip magnetic-field sensor array — is described as golden-model free and was demonstrated on an AES-128 test chip with four AES hardware Trojans (arXiv:2401.12193).
