Bare-Metal Test Generation
Overview
Bare-metal test generation is used in RISC-V processor verification to create software-driven stimulus that exercises architectural and system behavior directly on verification targets. The provided evidence describes this approach through STING, a bare-metal functional verification tool for RISC-V that generates constrained-random and directed tests. These tests are intended to be portable across simulation, emulation, FPGA prototypes, and silicon, and are self-checking to simplify debugging [STING bare-metal generator].
The need for this style of generation is tied to RISC-V verification complexity. The RISC-V ISA is modular and has many optional extensions, which increases the challenge of achieving comprehensive verification coverage [RISC-V verification complexity]. The evidence states that comprehensive coverage typically requires more than one verification or comparison methodology and more than one stimulus technique [RISC-V verification complexity].
Role in RISC-V verification
Bare-metal test generation supports a combined stimulus strategy:
- Constrained-random stimulus explores broad state spaces and can uncover unanticipated behaviors [random and directed strategy].
- Directed tests provide structure and can systematically target specific ISA features or coverage gaps [random and directed strategy].
- Combined random and directed stimulus is described as the most effective approach, with random testing used for breadth and directed suites used for precision [random and directed strategy].
The evidence warns that random testing alone can leave gaps. Features such as privilege-mode transitions, page-table walks, and memory protection may not be fully exercised by random generation alone [random-alone gaps]. Directed suites can address such features systematically, but may miss subtle corner-case interactions; therefore, the flow combines both techniques [random and directed strategy].
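The combined strategy above can be sketched as a small test planner. This is only an illustration: the instruction pool, the directed test names, and the `build_plan` function are hypothetical and are not taken from STING or any other tool named in the evidence. It shows constrained-random streams drawn from an allowed subset, followed by directed tests for the features that random generation may not fully exercise.

```python
import random

# Hypothetical instruction pool and directed test names, for illustration only.
RANDOM_POOL = ["add", "sub", "lw", "sw", "beq", "csrrw"]
DIRECTED_TESTS = [
    "privilege_mode_transition",
    "page_table_walk",
    "memory_protection",
]


def random_stream(rng, length, allowed):
    """Constrained-random: draw instructions only from the allowed subset."""
    return [rng.choice(allowed) for _ in range(length)]


def build_plan(seed, random_streams=3, length=4):
    """Mix random streams (breadth) with directed tests (precision)."""
    rng = random.Random(seed)  # seeded so that a failing plan can be replayed
    plan = [
        ("random", random_stream(rng, length, RANDOM_POOL))
        for _ in range(random_streams)
    ]
    plan += [("directed", name) for name in DIRECTED_TESTS]
    return plan


plan = build_plan(seed=1)
for kind, payload in plan:
    print(kind, payload)
```

Seeding the generator keeps the random portion reproducible, which matters when a failure found in one environment has to be replayed in another.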
STING-based bare-metal generation
STING is described as a bare-metal, software-driven generator developed for RISC-V. It produces C++-based random streams and ASM-style directed tests, built on a lightweight kernel, libraries, and device drivers [STING architecture]. It also includes a programming framework for developing directed tests and uses stimulus graphs to control scheduling of both random and directed tests [STING architecture].
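The evidence does not describe the format of STING's stimulus graphs. As a generic sketch of the idea, scheduling can be modeled as a dependency graph whose nodes are random or directed tests. The node names below are hypothetical, and the ordering uses Python's standard `graphlib` module (Python 3.9 or later), not any STING API.

```python
from graphlib import TopologicalSorter

# Hypothetical stimulus graph: each node maps to the set of nodes
# that must complete before it is scheduled.
graph = {
    "boot_kernel": set(),
    "random_int_stream": {"boot_kernel"},
    "directed_pmp_test": {"boot_kernel"},
    "final_check": {"random_int_stream", "directed_pmp_test"},
}

# static_order() yields every node after all of its predecessors.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

The two middle nodes have no dependency on each other, so a scheduler is free to run them in either order or interleave them.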
The generated programs are portable across multiple execution environments, including:
- RTL simulation,
- ZeBu emulation,
- HAPS FPGA prototypes,
- and silicon [portable stimulus].
The evidence also states that these programs are architecturally self-checking [portable stimulus]. This portability supports shift-left verification, where tests can begin in simulation and be reused in emulation, prototyping, and silicon to reduce late-stage risk [shift-left verification].
Verification targets and bug classes
Bare-metal generated tests are especially relevant for processor behaviors that are difficult to cover exhaustively with a single stimulus style. The evidence identifies STING as effective for stressing:
- privilege levels,
- memory protection,
- control and status registers,
- and hypervisor extensions [portable stimulus].
Reported issue classes exposed by STING include:
- deadlocks in page-table walks,
- mishandling of the fence.i instruction,
- floating-point NaN quirks,
- and cache-coherence conflicts [reported STING findings].
The evidence also defines several RISC-V features and behaviors commonly relevant to such tests. PMP and ePMP restrict access to memory regions to enforce privilege, isolation, and security policies [PMP definition]. Sv39 and Sv48 are RISC-V virtual-memory schemes using 39-bit and 48-bit virtual addresses and multi-level page-table structures [Sv39 Sv48 definition]. Floating-point NaNs include signalling NaNs, which raise exceptions, and quiet NaNs, which propagate silently [NaN definition]. Cache-coherence conflicts involve multi-core cache situations where accesses to the same cache line can lead to stale data, corruption, or stalls if coherence is not enforced correctly [cache-coherence definition].
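Two of these definitions can be made concrete with a short sketch. It assumes the standard RISC-V layout of a 12-bit page offset with 9-bit virtual page number (VPN) fields (three for Sv39, four for Sv48) and the IEEE 754 single-precision encoding in which the most significant fraction bit distinguishes quiet from signalling NaNs. The function names are illustrative and do not come from the evidence.

```python
PAGE_OFFSET_BITS = 12
VPN_BITS = 9
LEVELS = {"sv39": 3, "sv48": 4}  # number of page-table levels per scheme


def split_va(va, scheme):
    """Split a virtual address into VPN fields (lowest level first) and offset."""
    levels = LEVELS[scheme]
    va_bits = PAGE_OFFSET_BITS + VPN_BITS * levels  # 39 or 48
    va &= (1 << va_bits) - 1  # keep only the translated address bits
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    vpn = [
        (va >> (PAGE_OFFSET_BITS + VPN_BITS * i)) & ((1 << VPN_BITS) - 1)
        for i in range(levels)
    ]
    return vpn, offset


def classify_f32(bits):
    """Classify a 32-bit float pattern as quiet NaN, signalling NaN, or neither."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0xFF or fraction == 0:
        return "not_nan"  # finite value or infinity
    return "quiet" if fraction & (1 << 22) else "signalling"


print(split_va(0x12345678, "sv39"))
print(classify_f32(0x7FC00000), classify_f32(0x7F800001))
```

A directed test for page-table walks would typically choose addresses that vary each VPN field independently, and a NaN test would feed both NaN classes through each floating-point operation.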
Coverage and closure
Bare-metal test generation is part of a broader coverage-closure process. The evidence defines coverage closure as achieving sufficient functional and code coverage to provide confidence that relevant design behaviors have been tested [coverage closure]. Functional coverage and stimulus coverage measure how thoroughly stimulus has exercised ISA features and system behaviors [functional stimulus coverage].
The evidence also notes that automatically generated coverage models, such as ImperasFC and ImperasSC, can provide detailed insight into coverage gaps and integrate with Verdi [functional stimulus coverage]. Directed stimulus from STING and directed suites such as ImperasTS can then be used to address coverage gaps found during analysis [ImperasTS closure].
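The gap-driven step can be sketched as a set difference between the bins a coverage model defines and the bins the regression has hit, followed by a lookup of directed tests that target each gap. The bin names, test names, and mapping are hypothetical. They do not reflect the actual ImperasFC, ImperasSC, or ImperasTS formats, which the evidence does not describe.

```python
# Hypothetical coverage bins and the subset hit by random regressions.
ALL_BINS = {"add_rs1_x0", "csr_mstatus_write", "pmp_napot_fault", "sv39_superpage"}
HIT_BINS = {"add_rs1_x0", "csr_mstatus_write"}

# Hypothetical mapping from a coverage bin to a directed test that targets it.
DIRECTED_FOR_BIN = {
    "pmp_napot_fault": "directed_pmp_napot",
    "sv39_superpage": "directed_sv39_superpage",
}


def coverage_gaps(all_bins, hit_bins):
    """Return the bins that no test has exercised yet, in a stable order."""
    return sorted(set(all_bins) - set(hit_bins))


def closure_plan(gaps, directed_for_bin):
    """Pick a directed test per gap; None marks a gap that needs a new test."""
    return {gap: directed_for_bin.get(gap) for gap in gaps}


gaps = coverage_gaps(ALL_BINS, HIT_BINS)
plan = closure_plan(gaps, DIRECTED_FOR_BIN)
print(plan)
```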
Comparison and debug flow
Bare-metal generated programs can be used with simulation and reference-model comparison flows. ImperasDV integrates fast RISC-V reference models and enables lock-step comparison: the RTL and a golden reference model run in parallel, and their results are compared at instruction retirement so that bugs are detected early [lock-step comparison].
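The comparison at instruction retirement can be illustrated with a minimal sketch. It assumes each retired instruction is recorded as a `(pc, rd, value)` tuple, which is a simplification chosen for this example and not the ImperasDV interface. The checker walks both streams together and reports the first retirement at which they disagree.

```python
from itertools import zip_longest


def first_mismatch(dut_retired, ref_retired):
    """Return (index, dut_record, ref_record) at the first divergence, else None."""
    pairs = zip_longest(dut_retired, ref_retired)  # pads the shorter stream with None
    for index, (dut, ref) in enumerate(pairs):
        if dut != ref:
            return index, dut, ref
    return None


# Hypothetical retirement records: (pc, destination register, written value).
reference = [(0x80000000, 5, 1), (0x80000004, 6, 2), (0x80000008, 7, 3)]
rtl = [(0x80000000, 5, 1), (0x80000004, 6, 9), (0x80000008, 7, 3)]

print(first_mismatch(rtl, reference))
```

Stopping at the first divergence is what makes the method useful for debug: the mismatch is reported at the instruction where state first differs instead of at a later, harder-to-trace symptom.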
The provided evidence also places bare-metal tests in a tool flow:
- VCS executes STING-generated random tests and ImperasTS directed suites to accelerate debug and coverage closure [VCS role].
- Verdi is used for waveforms, mismatch tracking, and functional coverage reporting [Verdi role].
- ZeBu emulation supports long software-driven tests, OS bring-up, and large-scale workloads [ZeBu role].
- HAPS prototyping supports pre-silicon software development, performance validation, and extended regression cycles [HAPS role].
Practical significance
In the provided evidence, bare-metal test generation is significant because it provides portable, self-checking stimulus that can be reused across multiple verification stages. It complements directed suites, coverage analysis, lock-step reference comparison, and debug platforms. For RISC-V, where optional ISA features and extensions increase verification complexity, the evidence supports using bare-metal constrained-random and directed generation as part of a combined strategy for discovery, targeted closure, and late-stage risk reduction.