STIMSMITH

Reference Model

Concept WIKI v3 · 6/10/2026

A reference model is an executable model, typically software, used in simulation-based functional verification to predict the expected behavior of a design for a given input. It serves as the oracle against which the device under test (DUT) is compared. Reference models are central to UVM scoreboard flows (e.g., Spike for RISC-V vector verification), to differential hardware fuzzing (e.g., DifuzzRTL), and to FPGA-accelerated co-verification frameworks (e.g., Lyra), where the reference model runs concurrently with the DUT on the same FPGA SoC.

Definition

In simulation-based functional verification, a design's actual behavior is checked by simulating the HDL implementation, driving stimuli into it, and comparing the observed behavior with the expected behavior implied by the specification. In this context, a reference model is the software or executable model that predicts how the design should behave for a given input. It accepts instructions or stimuli as input and generates the expected results used to evaluate the device.

The same term carries a different meaning in software-architecture discourse: an abstract framework for system-interaction semantics that classifies bidirectional interactions as horizontal (stateful, asynchronous, nondeterministic, described by protocols) or vertical (asymmetric, described by object models, operations, or anonymous events). This article covers only the verification meaning.

Role in a verification environment

A reference model provides the expected result stream against which the device under test (DUT) is checked. In the cited RISC-V vector processing unit (VPU) environment, a UVM scoreboard compares VPU results with results from the reference model. When an instruction completes, the scoreboard executes a comparison method; for vector instructions, the environment supplies the destination vector register value extracted from the reference model, so that the scoreboard can compare the expected and observed register contents.

Reference models in hardware fuzzing

In differential hardware fuzzing, a reference model acts as a correctness oracle for randomly generated test programs. DifuzzRTL improved upon the earlier RFUZZ coverage-directed fuzzer by adding clock-sensitive optimization and incorporating a reference model, enabling better capture of state transitions and more effective checking of RTL execution results. The reference model in differential fuzzing is typically an ISA-level simulator executed in parallel with the DUT so that architectural-state mismatches can be flagged as bugs without requiring a hand-written checker for each instruction.
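The lockstep comparison described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-instruction toy ISA, the function names, and the planted operand-swap bug are hypothetical and are not taken from DifuzzRTL or any real fuzzer, and purely random programs stand in for coverage-directed generation.

```python
import random

MASK = 0xFFFFFFFF  # toy 32-bit registers


def reference_step(regs, instr):
    """Golden ISA-level semantics for a hypothetical three-operand toy ISA."""
    op, rd, rs1, rs2 = instr
    a, b = regs[rs1], regs[rs2]
    if op == "add":
        regs[rd] = (a + b) & MASK
    elif op == "sub":
        regs[rd] = (a - b) & MASK
    elif op == "xor":
        regs[rd] = a ^ b
    else:
        raise ValueError(f"unknown opcode: {op}")


def buggy_dut_step(regs, instr):
    """Stand-in for the RTL: same as the reference except for a planted bug."""
    op, rd, rs1, rs2 = instr
    if op == "sub":
        regs[rd] = (regs[rs2] - regs[rs1]) & MASK  # operands swapped (the bug)
    else:
        reference_step(regs, instr)


def differential_run(program, dut_step, init):
    """Run both models in lockstep; return (index, instr) of the first
    architectural-state mismatch, or None if the states never diverge."""
    ref_regs, dut_regs = list(init), list(init)
    for i, instr in enumerate(program):
        reference_step(ref_regs, instr)
        dut_step(dut_regs, instr)
        if ref_regs != dut_regs:
            return i, instr
    return None


def fuzz(dut_step, trials=200, seed=0, nregs=4, length=8):
    """Generate random programs and report the first one that exposes a mismatch."""
    rng = random.Random(seed)
    ops = ["add", "sub", "xor"]
    for _ in range(trials):
        program = [
            (rng.choice(ops), rng.randrange(nregs), rng.randrange(nregs), rng.randrange(nregs))
            for _ in range(length)
        ]
        init = [rng.randrange(MASK + 1) for _ in range(nregs)]
        hit = differential_run(program, dut_step, init)
        if hit is not None:
            return program, hit
    return None
```

Note that no per-instruction checker is written anywhere: the only oracle is the full register-state comparison against `reference_step`, which is the property the paragraph above attributes to reference-model-based fuzzing.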

FPGA-accelerated concurrent execution (Lyra)

The Lyra RISC-V verification framework is a heterogeneous GPU-CPU-FPGA co-verification platform that places the reference model on the same FPGA System-on-Chip (SoC) as the DUT. Following the Encore architecture, the DUT runs on the programmable logic (PL) while a software ISA emulator serving as the reference model runs on the on-chip hardened ARM processors. Dedicated hardware checkers perform runtime differential checking of execution results at the instruction level, and register-level coverage points are instrumented directly on the FPGA so that coverage collection is no longer bottlenecked by software simulation. LyraGen, a 125-million-parameter domain-specialized generative model retrained from OPT-125M, produces the instruction streams that drive both the DUT and the reference model concurrently, with an on-FPGA encoding module translating each instruction into the model's tokenized format. Keeping the reference model on the same SoC as the DUT, rather than in a separate software simulation, is what enables Lyra to report large end-to-end verification speedups over purely software-based differential fuzzers such as DifuzzRTL and Cascade.

Spike as a golden/reference model

The VPU environment used the RISC-V ISA simulator Spike for co-simulation inside the UVM environment. Spike had two main roles:

  • executing scalar instructions and providing vector instructions to UVM in program order; and
  • acting as the golden/reference model used to check DUT results.

To support these roles, Spike was modified with:

  • SystemVerilog Direct Programming Interface (DPI) functions;
  • a method that resumes simulation until a vector instruction is executed and returns reference results to UVM;
  • functions for reading Spike's memory; and
  • a mechanism to force reduction results into Spike to avoid divergence in unordered floating-point reductions.
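The shape of such an interface can be sketched as follows. This is not Spike's API and not the actual DPI layer: `ToyIss`, its method names, and the tiny instruction set (`li`, `sw`, `vadd_vx`) are hypothetical stand-ins, written in Python only to show the three capabilities named above (run until the next vector instruction, read memory, force a register value).

```python
from dataclasses import dataclass, field


@dataclass
class VectorEvent:
    pc: int             # index of the vector instruction in program order
    instr: tuple        # the vector instruction itself
    expected_vd: list   # reference value of the destination vector register


@dataclass
class ToyIss:
    """Hypothetical instruction-set simulator with a co-simulation interface."""
    program: list
    xregs: list = field(default_factory=lambda: [0] * 4)
    vregs: list = field(default_factory=lambda: [[0] * 4 for _ in range(4)])
    mem: dict = field(default_factory=dict)
    pc: int = 0

    def run_until_vector(self):
        """Execute scalar instructions internally and stop after the next
        vector instruction, returning it with its reference result.
        Returns None once the program is exhausted."""
        while self.pc < len(self.program):
            pc, instr = self.pc, self.program[self.pc]
            self.pc += 1
            op = instr[0]
            if op == "li":                      # li xd, imm
                _, xd, imm = instr
                self.xregs[xd] = imm
            elif op == "sw":                    # sw xs, addr
                _, xs, addr = instr
                self.mem[addr] = self.xregs[xs]
            elif op == "vadd_vx":               # vd[i] = vs[i] + x[xs]
                _, vd, vs, xs = instr
                self.vregs[vd] = [e + self.xregs[xs] for e in self.vregs[vs]]
                return VectorEvent(pc, instr, list(self.vregs[vd]))
            else:
                raise ValueError(f"unknown opcode: {op}")
        return None

    def read_mem(self, addr):
        """Let the testbench inspect the simulator's memory."""
        return self.mem.get(addr, 0)

    def force_vreg(self, vd, value):
        """Overwrite a vector register so later instructions see the forced value."""
        self.vregs[vd] = list(value)
```

A testbench would call `run_until_vector()` repeatedly, forward each returned event to the DUT side, and use `force_vreg()` only in the cases where the reference value is known to be one of several legal answers.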

Scoreboard comparison flow

When Spike finds a vector instruction, it provides the instruction, reference results, and other relevant data to UVM. The instruction is packed as a transaction and sent to the issue agent, then executed by the VPU. The reference-model results are compared with the VPU-generated results. This makes the reference model a central oracle for result checking, while the scoreboard performs the actual comparison inside the UVM environment. The destination vector register value pulled from the reference model at instruction-completion time is the key datum compared against the VPU's observed value.
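A minimal sketch of the comparison step, assuming in-order completion and a simple queue of expected results. The class and method names are hypothetical and only loosely mirror how a UVM scoreboard is organized; the real environment is written in SystemVerilog.

```python
from collections import deque


class Scoreboard:
    """Holds reference results in program order and checks DUT results on completion."""

    def __init__(self):
        self.pending = deque()   # (instr_id, vd_index, expected_vd)
        self.checked = 0
        self.mismatches = []

    def write_expected(self, instr_id, vd_index, expected_vd):
        """Called when the reference model hands over a vector instruction."""
        self.pending.append((instr_id, vd_index, list(expected_vd)))

    def on_complete(self, instr_id, observed_vd):
        """Called when the DUT reports completion; compares destination register values."""
        if not self.pending:
            raise RuntimeError(f"completion for {instr_id} with no expected result queued")
        exp_id, vd_index, expected = self.pending.popleft()
        if exp_id != instr_id:
            # Assumption of this sketch: instructions complete in program order.
            raise RuntimeError(f"out-of-order completion: expected {exp_id}, got {instr_id}")
        self.checked += 1
        if list(observed_vd) != expected:
            self.mismatches.append((instr_id, vd_index, expected, list(observed_vd)))
            return False
        return True
```

The reference model never appears inside the scoreboard itself; it only feeds `write_expected`, which keeps the oracle and the comparison logic separable.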

Handling legal model/DUT differences

A reference model and the DUT may use different algorithms that are both legal under the specification. The evidence describes this issue for unordered floating-point reductions: the VPU used a different reduction algorithm from Spike, which the RVV specification allows. Because floating-point addition is not associative, this caused occasional mismatches even when the VPU result was correct. Leaving the mismatched value in Spike's registers could then cause further divergence when later instructions consumed it.

To address this, the verification team created an independent C reference model for unordered reductions. That model implemented exactly the same reduction algorithm as the DUT. For those cases, the VPU result was compared against the C reduction reference model instead of Spike; if the result matched, the value was injected into Spike's register state to keep later execution aligned. This illustrates a common practical pattern: a primary reference model (an ISA simulator) may be paired with narrower, DUT-algorithm-matched reference models to resolve ambiguities permitted by the specification.
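The effect is easy to reproduce with a small sketch. The two summation orders below are assumptions chosen for illustration (a left-to-right sum standing in for the primary simulator and a pairwise tree standing in for the DUT); the source does not state which algorithms Spike or the VPU actually use, and the function names are hypothetical.

```python
def ordered_sum(values):
    """Left-to-right accumulation (stand-in for the primary reference model)."""
    acc = 0.0
    for v in values:
        acc += v
    return acc


def tree_sum(values):
    """Pairwise tree reduction (stand-in for a DUT-matched reference model)."""
    vals = list(values)
    if not vals:
        return 0.0
    while len(vals) > 1:
        nxt = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:
            nxt.append(vals[-1])  # carry the odd element to the next level
        vals = nxt
    return vals[0]


def check_unordered_reduction(values, dut_result, primary_regs, rd):
    """Check the DUT against the algorithm-matched model; on a match, force the
    value into the primary model's register state so later instructions agree."""
    if dut_result != tree_sum(values):
        return False              # genuine mismatch: leave primary state untouched
    primary_regs[rd] = dut_result
    return True


values = [1e16, 1.0, -1e16, 1.0]
print(ordered_sum(values))  # 1.0
print(tree_sum(values))     # 0.0
```

Both results are legitimate roundings of the same mathematical sum, which is why comparing the DUT against the left-to-right model alone would flag a false failure here.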

Automating reference-model construction

Reference models themselves are becoming more intricate and time-consuming to develop as integrated-circuit designs grow in complexity. ChatModel is an LLM-aided agile reference-model generation and verification platform that automates the transition from design specifications to fully functional reference models by integrating design standardization and hierarchical agile modeling. It employs a building-block generation strategy and, when evaluated on 300 designs of varying complexity, reported large efficiency and capacity gains over alternative generation methods. Such tools are increasingly relevant because the quality of the reference model directly bounds the bugs that a verification environment can detect.

CITATIONS

10 sources
10 citations
[1] A reference model is a software model that predicts how a design should behave based on inputs and generates the expected results used to evaluate the DUT. VPU Verification Environment (UPC repository PDF)
[2] In simulation-based functional verification, the design's actual behavior is verified by simulating the HDL description, driving stimuli, and comparing observed behavior to the expected behavior implied by the specification. AAAI 2006: Simulation-based Functional Verification of Hardware Designs
[3] DifuzzRTL improved upon RFUZZ by adding clock-sensitive optimization and incorporating a reference model, enabling better capture of state transitions and more effective checking of RTL execution results. Lyra: A Hardware-Accelerated RISC-V Verification Framework with Generative Model-Based Processor Fuzzing
[4] Lyra instantiates both the DUT and the software reference model on the same FPGA SoC, with the DUT on programmable logic and the reference model on hardened ARM processors, and uses dedicated hardware checkers for runtime differential checking of execution results. Lyra: A Hardware-Accelerated RISC-V Verification Framework with Generative Model-Based Processor Fuzzing
[5] In conventional verification, the three main phases (stimulus generation, test execution of both DUT and reference model, and coverage collection) are all performed in software, with throughput typically reaching only tens of kHz. Lyra: A Hardware-Accelerated RISC-V Verification Framework with Generative Model-Based Processor Fuzzing
[6] Spike was used as a golden/reference model to check DUT results, executing scalar instructions and providing vector instructions to UVM in program order, with DPI-callable functions used to drive and inspect the simulator from SystemVerilog. VPU Verification Environment (UPC repository PDF)
[7] A UVM scoreboard compared VPU results against the reference model and included the destination vector register value extracted from the reference model in the comparison. VPU Verification Environment (UPC repository PDF)
[8] An independent C reference model was created for unordered floating-point reductions because the VPU used a different legal algorithm than Spike; matched values were then injected into Spike's register state to keep later execution aligned. VPU Verification Environment (UPC repository PDF)
[9] ChatModel is an LLM-aided agile reference-model generation and verification platform that automates the transition from design specifications to functional reference models and reports large efficiency and capacity gains over alternative generation methods. ChatModel: Automating Reference Model Design and Verification with LLMs
[10] In software-architecture discourse, the term 'reference model' is also used for an abstract framework for system-interaction semantics that classifies interactions along dimensions of state, determinism, and synchronicity. A reference model for interaction semantics

VERSION HISTORY

v3 · 6/10/2026 · minimax/minimax-m3 (current)
v2 · 5/27/2026 · gpt-5.5
v1 · 5/25/2026 · gpt-5.5