Efficient Cross-Level Testing for Processor Verification: A RISC-V Case-Study Wiki


Paper WIKI v2 · 5/30/2026

“Efficient Cross-Level Testing for Processor Verification: A RISC-V Case-Study” presents a RISC-V processor-verification approach based on unrestricted, generic on-the-fly instruction-stream generation and tight co-simulation between an Instruction Set Simulator and an RTL core. In the reported case study on a TGF-series RISC-V core, the approach found several serious bugs, each of the described bugs in less than five minutes, and generated and co-simulated 226 million instructions in one hour.

Overview

The paper addresses simulation-based processor verification for RISC-V RTL cores. It describes an approach that places no restrictions on the generated instructions and combines a lightweight test-generation process with tight co-simulation between an Instruction Set Simulator (ISS) and an RTL core.

Approach

The paper’s central technical idea is to combine generic on-the-fly instruction-stream generation with ISS/RTL co-simulation. The authors report that this lightweight generation process and tight coupling between the ISS and RTL core enable high execution throughput during verification.
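The co-simulation idea can be illustrated with a deliberately tiny sketch. It is not the paper's testbench: `ToyCore`, `cosimulate`, and the `addi_bug` flag are hypothetical names, the toy model only executes ADDI and treats every other word as a no-op, and the same Python class stands in for both the ISS and the RTL core. What it does show is the loop structure: each instruction is generated on the fly as an unrestricted random 32-bit word, fed to both models, and the architectural state is compared after every step.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyCore:
    """Hypothetical stand-in for a model under test: 32 registers plus a PC."""
    regs: list = field(default_factory=lambda: [0] * 32)
    pc: int = 0
    addi_bug: bool = False  # assumption: an injected bug, for demonstration only

    def step(self, insn: int) -> None:
        opcode = insn & 0x7F
        funct3 = (insn >> 12) & 0x7
        rd = (insn >> 7) & 0x1F
        rs1 = (insn >> 15) & 0x1F
        if opcode == 0x13 and funct3 == 0:  # ADDI
            imm = insn >> 20
            if imm & 0x800 and not self.addi_bug:
                imm -= 0x1000  # sign-extend the 12-bit immediate
            if rd != 0:  # x0 is hard-wired to zero
                self.regs[rd] = (self.regs[rs1] + imm) & 0xFFFFFFFF
        # Every other word is treated as a no-op in this toy model.
        self.pc = (self.pc + 4) & 0xFFFFFFFF


def cosimulate(iss, rtl, seed, max_insns):
    """Feed identical random words to both models; return the first mismatch."""
    rng = random.Random(seed)
    for n in range(max_insns):
        insn = rng.getrandbits(32)  # unrestricted: any 32-bit word
        iss.step(insn)
        rtl.step(insn)
        if (iss.pc, iss.regs) != (rtl.pc, rtl.regs):
            return n, insn
    return None


if __name__ == "__main__":
    hit = cosimulate(ToyCore(), ToyCore(addi_bug=True), seed=1, max_insns=50_000)
    if hit is not None:
        print(f"mismatch at instruction {hit[0]}: {hit[1]:#010x}")
```

Comparing state after every instruction pins a mismatch to the instruction that caused it, at the price of a comparison per step. A real setup would also have to handle traps, CSRs, and pipeline timing, none of which this sketch models.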

Case study

The case study targets a pipelined industrial RISC-V TGF-series core. The approach reportedly found several serious bugs in that core, each of the described bugs in less than five minutes.

The bug categories described include:

CSR access errors, such as writes to read-only CSRs not causing illegal-instruction traps, and some legal writes to non-read-only CSRs incorrectly causing exceptions.
Incorrect handling of MEPC lower bits, allowing software to write an unaligned address and potentially cause an unaligned jump.
Incorrect initialization and update behavior for MISA, including updates to unsupported values.
Incorrect MTVAL behavior for ECALL, where MTVAL should be set to zero but was instead set to the ECALL instruction encoding.
Allowing software to write a reserved value into the MODE field of MTVEC.
EBREAK setting MCAUSE to illegal instruction instead of breakpoint.
FENCE and FENCE.I causing illegal-instruction traps for specific options because of a decoder implementation issue.
Writes to MINSTRET and MCYCLE incorrectly causing illegal-instruction traps, even though these counter CSRs are allowed to be modified by software.
MINSTRET not being correctly updated on a write access.
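Two of the listed categories come down to simple rules that a reference model can encode. The sketch below is an illustration, not code from the paper: the function names are hypothetical and RV32 is assumed. It relies on the RISC-V privileged-architecture convention that CSR address bits [11:10] equal to 0b11 mark a read-only register, and on the rule that the low bits of MEPC always read as zero (bit 0 always, and bits 1:0 when only 32-bit instruction alignment is supported).

```python
# Standard machine-mode CSR addresses from the RISC-V privileged specification.
CSR_MVENDORID = 0xF11
CSR_MCYCLE = 0xB00
CSR_MINSTRET = 0xB02


def csr_is_read_only(addr: int) -> bool:
    """Read-only CSRs have address bits [11:10] == 0b11."""
    return ((addr >> 10) & 0b11) == 0b11


def legalize_mepc(value: int, ialign: int = 32) -> int:
    """Clear the MEPC bits that must read as zero (RV32 assumed).

    ialign=32: only 4-byte-aligned instructions, so bits [1:0] are zero.
    ialign=16: compressed instructions supported, so only bit 0 is zero.
    """
    mask = 0b11 if ialign == 32 else 0b1
    return value & 0xFFFFFFFF & ~mask


if __name__ == "__main__":
    for name, addr in [("mvendorid", CSR_MVENDORID),
                       ("mcycle", CSR_MCYCLE),
                       ("minstret", CSR_MINSTRET)]:
        print(name, "read-only" if csr_is_read_only(addr) else "read/write")
    print(hex(legalize_mepc(0x80000003)))
```

Under this rule MCYCLE and MINSTRET decode as read/write, which is why a trap on writing them counts as a bug, while a write to a register such as MVENDORID is expected to raise an illegal-instruction exception.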

Performance and execution metrics

In one reported one-hour run, the approach generated and co-simulated 226 million instructions. These consisted of 12 million illegal instructions and 214 million legal instructions. Among the legal instructions, 156 million completed normally and 58 million caused an exception or trap.

The paper reports an average throughput of 63 thousand instructions per second and 229 thousand RTL-core cycles per second. It also notes that the legal-instruction distribution was mostly uniform, with examples ranging from 6.0 million ADDI executions to 3.6 million MRET executions in the illustrated distribution.
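The reported figures can be cross-checked with a few lines of arithmetic. The numbers below are taken from the text above; the derived shares and the cycles-per-instruction ratio are computed here, not reported by the paper, and the ratio is only approximate because it is built from rounded throughput values.

```python
# Instruction counts in millions, as reported for the one-hour run.
TOTAL, ILLEGAL, LEGAL = 226, 12, 214
NORMAL, TRAPPED = 156, 58
RUN_SECONDS = 3600

# The breakdown is internally consistent.
assert ILLEGAL + LEGAL == TOTAL
assert NORMAL + TRAPPED == LEGAL

illegal_share = ILLEGAL / TOTAL                # about 0.053 of all instructions
trapped_share = TRAPPED / LEGAL                # about 0.271 of legal instructions
insns_per_second = TOTAL * 1e6 / RUN_SECONDS   # about 62.8 thousand, reported as 63 thousand
cycles_per_insn = 229 / 63                     # about 3.6 RTL cycles per instruction

if __name__ == "__main__":
    print(f"illegal share: {illegal_share:.1%}")
    print(f"trapping share of legal instructions: {trapped_share:.1%}")
    print(f"instructions per second: {insns_per_second:,.0f}")
    print(f"approx. cycles per instruction: {cycles_per_insn:.2f}")
```

So roughly one in twenty generated instructions was illegal, and a little over a quarter of the legal ones ended in an exception or trap.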

Future work identified by the paper

The reported future-work directions include parallelized test sessions using different random seeds, FPGA-based acceleration, testing the RTL core’s interrupt interface, extending the method to additional RISC-V ISA extensions, and developing coverage metrics and execution-feedback mechanisms that also account for RTL-specific coverage.

LINKED ENTITIES

43 links

Vladimir Herdt AUTHORED_BY

Daniel Große AUTHORED_BY

Eyck Jentzsch AUTHORED_BY

Rolf Drechsler AUTHORED_BY

Cross-Level Testing INTRODUCES

on-the-fly instruction stream generation USES

Co-Simulation USES

Instruction Set Simulator (ISS) USES

MINRES The Good Folk (TGF) Series RTL core EVALUATES

RISC-V ISA USES

RTL verification USES

SystemC USES

TLM (Transaction Level Modeling) USES

Verilator USES

SpinalHDL USES

RISC-V VP USES

co-simulation testbench INTRODUCES

instruction generation algorithm INTRODUCES

opcode injection USES

instruction field mutation USES

instruction sequence generation USES

riscv-dv COMPARES_WITH

RISC-V Torture Test COMPARES_WITH

Coverage-Guided Fuzzing COMPARES_WITH

Formal Verification MENTIONS

riscv-formal MENTIONS

Simulation-Based Verification USES

Model-Based Test Generation MENTIONS

constraint-based test generation MENTIONS

bayesian network coverage-guided test generation MENTIONS

constrained-random test generation MENTIONS

pipeline USES

Control and Status Register (CSR) USES

trap handling USES

RV32I USES

Genesys-Pro: Innovations in Test Program Generation for Functional Processor Verification MENTIONS

MicroTESK: specification-based tool for constructing test program generators MENTIONS

Extensible and configurable RISC-V based virtual prototype MENTIONS

Towards specification and testing of RISC-V ISA compliance MENTIONS

Verifying Instruction Set Simulators using Coverage-guided Fuzzing MENTIONS

Closing the RISC-V compliance gap: Looking from the negative testing side MENTIONS

Instruction Stream Generation USES

DFKI GmbH AUTHORED_BY

CITATIONS

7 sources

7 citations

[1] The paper uses unrestricted or generic on-the-fly instruction-stream generation and ISS/RTL co-simulation. Efficient Cross-Level Testing for

[2] The case study concerned a pipelined industrial RISC-V TGF-series core and found several serious bugs. Efficient Cross-Level Testing for

[3] All described bugs were found in less than five minutes each. Efficient Cross-Level Testing for

[4] The visible bug list includes CSR access, MEPC, MISA, MTVAL/ECALL, MTVEC MODE, EBREAK/MCAUSE, FENCE/FENCE.I, MINSTRET/MCYCLE trap behavior, and MINSTRET update issues. Efficient Cross-Level Testing for

[5] In one hour, the approach generated and co-simulated 226 million instructions: 12 million illegal and 214 million legal, with 156 million legal instructions completing normally and 58 million causing an exception or trap. Efficient Cross-Level Testing for

[6] The reported average throughput was 63 thousand instructions per second and 229 thousand RTL-core cycles per second. Efficient Cross-Level Testing for

[7] The paper’s future work includes parallelized test sessions, FPGA acceleration, interrupt-interface testing, additional RISC-V ISA extensions, and RTL-aware coverage and feedback mechanisms. Efficient Cross-Level Testing for

VERSION HISTORY

v2 · 5/30/2026 · gpt-5.5 (current)

v1 · 5/25/2026 · gpt-5.5
