Definition
Micro-architecture verification is the CPU verification activity that checks the functional correctness of a processor's internal logical building blocks—such as the fetch unit, execution unit, load/store unit, and cache—rather than only its externally visible instruction-set behavior. It is typically carried out as an IP- or block-level verification plan that sits between architectural compliance and full-chip integration within a broader CPU verification effort.
Role in CPU verification
A single verification plan is usually not sufficient for a CPU. The cited RISC-V CPU verification material distinguishes at least three planning areas:
- Architecture verification, which checks compliance with the instruction-set architecture and software-visible behavior (instructions, modes, memory management, interrupts, interfaces).
- Micro-architecture verification, which checks the CPU's logical implementation blocks and their detailed features.
- Performance verification, which uses patterns or benchmarks (e.g. SPECint, LMbench, Dhrystone) to measure performance and identify bottlenecks.
Micro-architecture verification can therefore be viewed as the IP- or block-level verification plan within the overall CPU verification effort.
Test planning
For each major micro-architectural block, the verification plan should list the relevant features and describe how those features will be verified, capturing both what to verify and how to verify:
- stimulus infrastructure;
- controls or "knobs" for randomization;
- sequences and tests;
- coverage properties used to assess stimulus quality;
- checking mechanisms for functional correctness;
- testbench components, hierarchy, and stimulus patterns.
A block diagram of the testbench components, hierarchy, and stimulus patterns should be captured and explained clearly, so that the plan can be translated into an implementation with fewer issues later in execution. The cited RISC-V material states that understanding both the CPU architecture and the micro-architecture is the starting point for deciding what must be verified, for ensuring that the logic and building blocks are functionally correct, and for choosing suitable verification methodologies.
Methods and checking mechanisms
The cited RISC-V material describes simulation with constrained-random or coverage-driven approaches as a common verification strategy. Selected areas of the design and special features may instead be tested using formal verification or other techniques.
Checking mechanisms may be implemented as:
- scoreboards;
- interface assertions;
- embedded assertions inside RTL;
- embedded assertions inside verification components.
Modern verification processes use object-oriented programming concepts for efficiency and reusability, and the same source identifies SystemVerilog as the language of choice for verification tasks in this context.
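The scoreboard idea listed above can be illustrated with a minimal sketch. The source names SystemVerilog as the usual language for this work; Python is used here only to keep the example short and runnable. The class `Scoreboard`, its methods, and the toy `reference_alu` model are hypothetical names introduced for illustration and do not come from the cited material.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Scoreboard:
    """In-order scoreboard: queue expected results, compare observed ones."""
    expected: deque = field(default_factory=deque)
    mismatches: list = field(default_factory=list)
    matched: int = 0

    def push_expected(self, item):
        # Called with the reference model's prediction for a transaction.
        self.expected.append(item)

    def check_observed(self, item):
        # Called with what a monitor saw on the design's output.
        if not self.expected:
            self.mismatches.append(("unexpected", None, item))
            return False
        want = self.expected.popleft()
        if want != item:
            self.mismatches.append(("mismatch", want, item))
            return False
        self.matched += 1
        return True

    def passed(self):
        # Pass only if nothing mismatched and nothing is still outstanding.
        return not self.mismatches and not self.expected


def reference_alu(op, a, b):
    """Toy 32-bit reference model (assumed for this example only)."""
    mask = 0xFFFFFFFF
    if op == "add":
        return (a + b) & mask
    if op == "sub":
        return (a - b) & mask
    raise ValueError(op)
```

In use, the stimulus side would call `push_expected(reference_alu(...))` for each transaction it drives, and the monitor side would call `check_observed(...)` for each result it samples. `passed()` is then evaluated at end of test.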
Sub-units that are targeted
The semiengineering.com coverage of RISC-V micro-architectural verification specifically identifies the following processor sub-units as typical targets:
- branch prediction;
- parts of a pipeline;
- memory systems such as caches;
- prefetch buffer;
- ALUs;
- register models;
- multipliers;
- load/store unit.
For these sub-units, properties can be captured as a vocabulary of commands, and a generator can build a sequence of those commands. The generator keeps extending the sequence until it finds one that makes the design diverge from a golden reference model, and then shrinks that sequence by removing commands that are not needed to reproduce the bug. This approach is described as a big benefit not only for finding bugs but also for diagnosing, debugging, and fixing them, and it tends to work very well for a large class of sub-units.
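The generate-then-shrink loop described above can be sketched on a toy sub-unit. Everything in this example is an assumption made for illustration: the four-entry FIFO standing in for something like a prefetch buffer, the `push`/`pop`/`flush` command vocabulary, the injected flush bug, and the function names. It is not the tooling discussed in the cited coverage.

```python
import random

CAPACITY = 4  # assumed buffer depth for this toy example


class GoldenFifo:
    """Reference model: a plain bounded queue."""

    def __init__(self):
        self.items = []

    def apply(self, cmd):
        op = cmd[0]
        if op == "push":
            if len(self.items) >= CAPACITY:
                return False
            self.items.append(cmd[1])
            return True
        if op == "pop":
            return self.items.pop(0) if self.items else None
        if op == "flush":
            self.items.clear()
            return None
        raise ValueError(op)


class BuggyFifo:
    """Circular-buffer implementation with a deliberately injected flush bug."""

    def __init__(self):
        self.buf = [None] * CAPACITY
        self.head = self.tail = self.count = 0

    def apply(self, cmd):
        op = cmd[0]
        if op == "push":
            if self.count >= CAPACITY:
                return False
            self.buf[self.tail] = cmd[1]
            self.tail = (self.tail + 1) % CAPACITY
            self.count += 1
            return True
        if op == "pop":
            if self.count == 0:
                return None
            value = self.buf[self.head]
            self.head = (self.head + 1) % CAPACITY
            self.count -= 1
            return value
        if op == "flush":
            self.tail = self.count = 0  # injected bug: head is not reset
            return None
        raise ValueError(op)


def fails(seq):
    """True if the implementation diverges from the golden model on seq."""
    golden, dut = GoldenFifo(), BuggyFifo()
    return any(golden.apply(c) != dut.apply(c) for c in seq)


def find_failing(rng, max_len=1000):
    """Keep extending a random command sequence until it exposes a mismatch."""
    seq = []
    for i in range(max_len):
        op = rng.choice(["push", "pop", "flush"])
        seq.append(("push", i) if op == "push" else (op,))
        if fails(seq):
            return seq
    return None


def shrink(seq):
    """Drop commands one at a time while the failure still reproduces."""
    changed = True
    while changed:
        changed = False
        for i in range(len(seq)):
            candidate = seq[:i] + seq[i + 1:]
            if fails(candidate):
                seq, changed = candidate, True
                break
    return seq
```

Calling `shrink(find_failing(random.Random(0)))` returns a sequence from which no single command can be removed without losing the failure. One-at-a-time removal gives a locally minimal reproducer, which is not guaranteed to be the shortest possible one; production shrinkers usually try removing larger chunks as well.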
Two ways to drive micro-architectural verification formally
Per Ashish Darbari (Axiomise), micro-architectural verification is typically pursued in two ways:
- Architectural assertions and covers failing in formal. RTL implementation can cause architectural violations that are picked up automatically as functional bugs—or even as safety or security issues via the confidentiality-integrity-availability (CIA) triad.
- Showering checks and covers across RTL interfaces. Formal tools pick up failures across different functional design components. This exposes more bugs, helps proofs converge through compositional reasoning, and improves overall coverage.
The cited material notes that formal tools are useful because, fundamentally, they explore every possible combination of inputs in an attempt to violate the ISA-specified behavior, which is generally captured as SystemVerilog assertions.
Formal vs. simulation by design region
A common split reported in the RISC-V coverage is:
- Control paths present challenges for simulation, and formal is often preferred.
- Data paths are at a totally different scale: simulation cannot come close to exhaustive coverage, so properties are written and proven with formal instead.
- Constrained-random simulation with coverage can also be effective, but carries the risk of leaving corner cases that formal would catch.
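The corner-case risk in the last bullet can be made concrete with a small sketch. It is only an analogy: brute-force enumeration over a tiny input space stands in for the exhaustiveness that formal tools achieve symbolically, and it is not a formal proof. The 8-bit saturating adder, the single injected bug, and all function names are assumptions made for this example.

```python
import itertools
import random


def ref_sat_add(a, b):
    """Reference behavior: 8-bit unsigned saturating add."""
    return min(a + b, 0xFF)


def dut_sat_add(a, b):
    """Implementation under test with one injected corner-case bug."""
    if a == 0x80 and b == 0x80:
        return 0x00  # injected bug: wraps instead of saturating
    return min(a + b, 0xFF)


def exhaustive_counterexamples():
    """Check all 65,536 input pairs, as an exhaustive method would."""
    return [
        (a, b)
        for a, b in itertools.product(range(256), repeat=2)
        if dut_sat_add(a, b) != ref_sat_add(a, b)
    ]


def random_counterexamples(rng, n):
    """Check n uniformly random input pairs, as unconstrained random would."""
    found = set()
    for _ in range(n):
        a, b = rng.randrange(256), rng.randrange(256)
        if dut_sat_add(a, b) != ref_sat_add(a, b):
            found.add((a, b))
    return sorted(found)
```

`exhaustive_counterexamples()` always reports the pair `(0x80, 0x80)`. A run such as `random_counterexamples(random.Random(1), 1000)` samples only a small fraction of the 65,536 pairs and may well return an empty list. Real data paths are far wider than 8 bits, which is why enumeration is impractical and symbolic formal methods are used instead.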
Golden-model comparison and its limits
A common processor verification approach is to compare what an implementation does against a golden model. The cited industry commentary argues that simply comparing instruction traces starts to have real issues when asynchronous events, multi-issue pipelines, or out-of-order execution are introduced. The ISA specification is not precise in every aspect—for example, it does not say what happens when six interrupts of the same priority all happen at once, which one the micro-architecture chooses to take, or at which stage in the pipeline. These are implementation choices made in the RTL, in the pipeline, in the micro-architecture, and will differ from core to core.
A practical workaround for timer-interrupt desynchronization between the reference model and the DUT is to remove the timer from the equation. If both the reference model and the DUT are correct, they will execute the same number of instructions, so a timer interrupt can be triggered every 5,000 retired instructions rather than every 1 million clock cycles.
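A minimal sketch of why counting retired instructions keeps the two sides aligned is given below. The function `interrupt_points`, its parameters, and the cycles-per-instruction lists are hypothetical, and the period of 5 is scaled down from the figures in the text so that the behavior is easy to inspect.

```python
def interrupt_points(cycles_per_instr, period, by="instret"):
    """Return the retired-instruction indices at which a timer would fire.

    cycles_per_instr: cycles each instruction takes on this model.
    period: timer period, in retired instructions or in cycles.
    by: "instret" to count retired instructions, "cycle" to count cycles.
    """
    points = []
    cycles = 0
    next_fire = period
    for retired, cpi in enumerate(cycles_per_instr, start=1):
        cycles += cpi
        counter = retired if by == "instret" else cycles
        if counter >= next_fire:
            points.append(retired)
            # Schedule the next firing at the next multiple of the period.
            next_fire = (counter // period + 1) * period
    return points


# Assumed timing: the reference model retires one instruction per cycle,
# while the DUT stalls for an extra cycle on every second instruction.
REF_TIMING = [1] * 20
DUT_TIMING = [1, 2] * 10
```

With a cycle-based timer, `interrupt_points(REF_TIMING, 5, by="cycle")` and `interrupt_points(DUT_TIMING, 5, by="cycle")` fire at different instruction boundaries, so the two traces diverge even though neither side has a bug. With `by="instret"`, both calls return the same list, because the trigger no longer depends on pipeline timing.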
RISC-V-specific challenges
RISC-V's flexibility and extensibility amplify the verification problem. The cited coverage states that, in the past year, several new extensions were announced by RISC-V International, and users are encouraged to make their own extensions and modifications. There are also many ways in which a core can be implemented, with the specification often leaving details open. However, while it may be a relatively quick and easy process to develop these extensions, it is not so easy to verify them.
The cited industry experts argue that the absence of license royalties does not make RISC-V the cheap option: there can be no shortcuts for verification if you want to be successful with RISC-V. Every custom feature added roughly doubles verification effort and complexity; it is easy to add things, but very hard to ship them at high quality. Each addition requires fully re-verifying the design, taking into account effects on the pipeline, conflicts in the ALU, issues with the caching system, and load/store interactions. Speculative and out-of-order execution techniques that are beginning to appear in RISC-V implementations also raise security concerns such as Spectre and Meltdown.
Coverage
Coverage is necessary but not sufficient. The cited commentary states that coverage tells you that you have done something and gives a certain level of confidence, but does not guarantee that there will not be problems; with the complexity of modern processors, this coverage is not going to be sufficient.
Processor coverage is described as unique—"everybody is talking a different language when they're talking about coverage." It is easy to go through and touch every one of the 432 million instruction variants that exist, but at that point, the coverage only says that the decoder was tested, not sequences of instructions, combinations, or what might happen to a pipeline. Generating billions of instructions is therefore not enough; better practice is to sit down with the designer, talk about the pipeline, identify the things they are really worried about, and focus on combinations of instructions that are more dangerous than others.
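The gap between touching every instruction and exercising instruction combinations can be shown with a small sketch. The five-opcode list is a hypothetical subset chosen for brevity, and the two coverage functions are illustrative rather than taken from any particular coverage tool.

```python
from itertools import product

# Hypothetical opcode subset used only for this illustration.
OPCODES = ["add", "mul", "lw", "sw", "beq"]


def opcode_coverage(stream):
    """Fraction of opcodes that appear at least once in the stream."""
    return len(set(stream) & set(OPCODES)) / len(OPCODES)


def pair_coverage(stream):
    """Fraction of back-to-back opcode pairs that appear in the stream."""
    all_pairs = set(product(OPCODES, repeat=2))
    seen = set(zip(stream, stream[1:])) & all_pairs
    return len(seen) / len(all_pairs)


# A stream that cycles through every opcode in a fixed order.
STREAM = OPCODES * 4
```

For `STREAM`, `opcode_coverage` reports 1.0 while `pair_coverage` reports 0.2: every opcode has been decoded, but only 5 of the 25 possible back-to-back pairs have been seen, and pairs such as a load followed directly by a dependent branch are never exercised. Longer sequences and pipeline-state crosses widen the gap further, which is why the text recommends focusing on the combinations the designer is worried about.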
Integration and the path to booting Linux
After sub-units are verified, they can be integrated. The cited commentary notes that the last thing you want is to find an ALU bug while booting Linux. Once integrated, a more mixed verification strategy is typically required:
- Formal remains useful at this stage because formal tools explore every possible combination of inputs in an attempt to violate ISA-specified behavior, which is generally captured as SystemVerilog assertions.
- Major processor vendors also have extensive verification suites, including UVM testbenches and test software.
- Emulation is necessary for complete verification of all the elements of a large processor and to ensure correct behavior integrated into an SoC, while also allowing test software to be executed on the processor under test.
- Hardware-assisted verification—such as virtual-prototype capabilities, emulation, and hardware prototyping—is described as critical for understanding how micro-architectural decisions affect the full SoC and the workloads running on them, particularly with new custom instructions and items such as vector extensions.
The cited industry voices also note that booting Linux is itself a powerful form of verification, that it is "amazing how many bugs you can have in a core and still boot Linux," and that bugs found this way often do not show up in other kinds of verification. Such bugs include many asynchronous effects, timers going off, and even timing-base differences between simulation and FPGA-based emulation.
Custom instructions and extensions
The cited RISC-V coverage observes that new custom instructions and items such as vector extensions are being introduced, and it is important to know how micro-architectural decisions affect the full SoC and the workloads running on them. Each added feature has to be completely re-verified, and then some, including its effect on the rest of the design, especially when it changes the pipeline, creates conflicts in the ALU, causes issues with the caching system, or affects load/store logic.