Overview
RISC-V ISA Tests are a RISC-V test suite cited in the paper "Verifying Instruction Set Simulators using Coverage-guided Fuzzing", which gives the repository URL https://github.com/riscv/riscv-tests. The paper treats the suite as a set of hand-written directed tests, which therefore need no test-generation step.
Use in simulator verification
The paper uses RISC-V ISA Tests as one of the evaluated test sets for instruction-set-simulator verification. The evaluation table compares:
- T1: RISC-V ISA Tests
- RISC-V Torture-generated test sets
- A coverage-guided fuzzing approach
For the RISC-V ISA Tests row, the paper reports a total time of 2 seconds. It notes that no generation step applies to this row, because the tests are hand-written directed tests rather than generated ones.
Reported coverage and detected errors
In the paper's evaluation, coverage for RISC-V ISA Tests was measured by instrumenting the ISS under test. The reported values for T1: RISC-V ISA Tests are:
| Metric | Reported value |
|---|---|
| Branch coverage | 90.24% |
| Functional coverage R1 | 58.57% |
| Functional coverage R2 | 61.70% |
| Functional coverage R3 | 50.00% |
| V(RS1) | 14.29% |
| V(RS2) | 2.70% |
| V(RD) | 7.55% |
| V(I imm) | 8.33% |
| V(I shmt) | 100.00% |
The same table lists the errors that RISC-V ISA Tests found in the ISS under test as [V1..V3]. The causes of V1, V2, and V3 are not described here, because the available evidence is limited to the table entry and does not include the paper's text on the individual defects.
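To make the arithmetic behind such percentages concrete, the following is a minimal Python sketch that computes a coverage percentage from hit counts and holds the T1 values from the table for comparison. The names `coverage_percent`, `T1_REPORTED`, and `least_covered` are illustrative assumptions, as are the example counts; the paper's actual instrumentation is not shown in the available evidence.

```python
def coverage_percent(covered: int, total: int) -> float:
    """Return covered/total as a percentage, rounded to two decimals.

    Hypothetical helper: it only illustrates how a coverage figure
    can be derived from counts of covered and total items.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= covered <= total:
        raise ValueError("covered must be between 0 and total")
    return round(100.0 * covered / total, 2)


# Values copied from the table for T1: RISC-V ISA Tests, in percent.
T1_REPORTED = {
    "Branch coverage": 90.24,
    "Functional coverage R1": 58.57,
    "Functional coverage R2": 61.70,
    "Functional coverage R3": 50.00,
    "V(RS1)": 14.29,
    "V(RS2)": 2.70,
    "V(RD)": 7.55,
    "V(I imm)": 8.33,
    "V(I shmt)": 100.00,
}


def least_covered(metrics: dict) -> str:
    """Return the name of the metric with the lowest reported value."""
    return min(metrics, key=metrics.get)
```

For example, `coverage_percent(1, 2)` yields `50.0` for the made-up counts 1 and 2, and `least_covered(T1_REPORTED)` returns `"V(RS2)"`, the smallest value in the table.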
Relationship to coverage-guided fuzzing
The paper positions the RISC-V ISA Tests as a directed-test baseline in an evaluation of coverage-guided fuzzing for RISC-V instruction-set simulators. Its conclusion states that the authors implemented a coverage-guided fuzzing approach on top of LLVM libFuzzer, evaluated it on three publicly available RISC-V ISSs, and found fuzzing effective for maximizing most coverage metrics and for finding errors. In that comparison, RISC-V ISA Tests were compact and fast, but the table shows that the fuzzing approach reached higher reported coverage across the listed metrics and found more of the labeled ISS-under-test errors.
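The general idea of coverage-guided fuzzing can be sketched in a few lines of Python. This is a toy illustration of the technique only, not the authors' libFuzzer-based implementation: `toy_iss_step` is a hypothetical stand-in for an instrumented ISS, the branch names are invented, and the single-bit-flip mutation is an assumed, deliberately simple strategy. The only RISC-V facts used are the base opcodes 0x33 (OP) and 0x13 (OP-IMM) and the funct7 value 0x20 used by SUB and SRA.

```python
import random


def toy_iss_step(word: int) -> set:
    """Hypothetical instrumented decoder: return the IDs of branches hit."""
    hit = set()
    opcode = word & 0x7F
    if opcode == 0x33:  # OP (register-register) instructions
        hit.add("op")
        if (word >> 25) & 0x7F == 0x20:  # funct7 used by SUB and SRA
            hit.add("op_sub_or_sra")
    elif opcode == 0x13:  # OP-IMM (register-immediate) instructions
        hit.add("op_imm")
    else:
        hit.add("illegal")
    return hit


def fuzz(seed_inputs, iterations, rng):
    """Keep mutated inputs only when they reach a branch not seen before."""
    if not seed_inputs:
        raise ValueError("at least one seed input is required")
    corpus = list(seed_inputs)
    covered = set()
    for word in corpus:
        covered |= toy_iss_step(word)
    for _ in range(iterations):
        parent = rng.choice(corpus)
        child = parent ^ (1 << rng.randrange(32))  # flip one random bit
        new_branches = toy_iss_step(child) - covered
        if new_branches:  # coverage feedback decides what is kept
            corpus.append(child)
            covered |= new_branches
    return corpus, covered


if __name__ == "__main__":
    corpus, covered = fuzz([0x00000033], 200, random.Random(0))
    print(len(corpus), sorted(covered))
```

The contrast with a directed suite is that the directed tests are a fixed corpus, whereas the loop above grows its corpus whenever the coverage feedback reports something new.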