CI/CD Infrastructure Wiki

Overview

In the paper Functional Verification of a RISC-V Vector Accelerator, CI/CD infrastructure denotes the automation layer used to generate tests, run simulations, classify failures, collect coverage, and gate changes to the device-under-test (DUT) RTL and the verification environment. The paper presents it as part of an industrial-grade verification effort and reports that automated constrained-random generation, simulation, error reporting, and the CI/CD infrastructure together helped find 3005 errors and reach 95.79% functional coverage. [c1]

Although the paper uses the term CI/CD infrastructure, the implementation details it provides focus primarily on continuous integration: automated test generation, execution, regression selection, coverage collection, and pre-merge and periodic regression runs. [c2]

Implementation stack

The CI infrastructure was built using the open-source CI server Jenkins. The authors created multiple interacting pipelines intended to keep the design as error-free as possible. [c3]

The project also used GitLab for version control and issue tracking. Documentation for running simulations and using the environment was added through GitLab Wiki guides and tutorials so that project members could reproduce and operate the verification flow. [c4]

Pipeline structure

The Jenkins-based infrastructure included four main pipeline types: [c5]

New tests — Generated random tests with RISCV-DV, compiled the DUT, executed the generated binaries, and classified test results into passed and failed directories. Passing tests were used to build a regression set, while failing tests were retained for debugging until the associated error was fixed. [c5]
Retry — Re-executed previously failed tests after each change to the DUT repository main branch, then reclassified them as passed or failed. [c5]
Selection — Ran daily at midnight. If the number of passing tests exceeded a configured threshold, tests were ranked by collected coverage and split into two regression sets: a large set and a small set. [c5]
Regressions — Ran the small regression set for DUT changes that were candidates for merge, and ran the large regression set weekly to check that recent changes had not broken known-good tests. [c6]
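
The interaction between the New tests, Retry, and Selection pipelines can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Jenkins implementation: the class name RegressionPool, the single numeric coverage score per test, the size parameters, and the choice to make the small set a prefix of the large set are all hypothetical. The paper states only that tests are classified as passed or failed, that failed tests are re-run, and that passing tests are ranked by coverage and split into a large and a small set once a threshold is exceeded.

```python
from dataclasses import dataclass, field


@dataclass
class RegressionPool:
    """Hypothetical in-memory stand-in for the passed/failed directories."""
    passed: dict[str, float] = field(default_factory=dict)  # test name -> coverage score
    failed: set[str] = field(default_factory=set)

    def classify(self, name: str, ok: bool, coverage: float = 0.0) -> None:
        """File one test result under passed or failed (New tests, Retry)."""
        if ok:
            self.failed.discard(name)
            self.passed[name] = coverage
        else:
            self.passed.pop(name, None)
            self.failed.add(name)

    def retry(self, run_test) -> None:
        """Re-run every previously failed test and reclassify it (Retry)."""
        for name in sorted(self.failed):  # sorted() copies, so the set can change
            ok, coverage = run_test(name)
            self.classify(name, ok, coverage)

    def select(self, threshold: int, small_size: int, large_size: int):
        """Rank passing tests by coverage and build two sets (Selection)."""
        if len(self.passed) <= threshold:
            return None  # not enough passing tests yet
        # Highest coverage first; ties broken by name for determinism.
        ranked = sorted(self.passed, key=lambda n: (-self.passed[n], n))
        # Assumption: the small set is the top slice of the large set.
        return ranked[:small_size], ranked[:large_size]


pool = RegressionPool()
pool.classify("t0", True, 0.40)
pool.classify("t1", False)
pool.classify("t2", True, 0.75)
pool.classify("t3", True, 0.10)
pool.retry(lambda name: (True, 0.55))  # pretend a DUT fix made t1 pass
small, large = pool.select(threshold=3, small_size=2, large_size=4)
```

In this sketch the small set would be the one run on merge candidates and the large set the one run weekly, mirroring the Regressions pipeline described above.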

Coverage and quality feedback

The CI flow collected functional coverage, assertion usage information, and code coverage from simulations. This made the automated runs part of both test generation and coverage closure, not only pass/fail checking. [c7]

When an error was found, the verification flow provided reproduction information such as the failing binary and faulty instruction. The team also maintained a table of active errors to support focused debugging, for example by grouping erroneous tests with the same instruction mnemonic, vector length, or element width. After a tentative fix, regressions were run before changes were merged. [c8]
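
The grouping of active errors can be illustrated with a short Python sketch. The record layout, field names, test names, and sample values below are assumptions made for illustration; the paper says only that erroneous tests could be grouped by instruction mnemonic, vector length, or element width.

```python
from collections import defaultdict
from typing import NamedTuple


class ActiveError(NamedTuple):
    test: str      # failing binary name (hypothetical naming scheme)
    mnemonic: str  # mnemonic of the faulty instruction
    vl: int        # vector length in effect when the error occurred
    sew: int       # element width in bits


def group_errors(errors, key):
    """Group failing test names by one field: "mnemonic", "vl", or "sew"."""
    groups = defaultdict(list)
    for err in errors:
        groups[getattr(err, key)].append(err.test)
    return dict(groups)


# Made-up entries, not data from the paper.
errors = [
    ActiveError("test_0012", "vadd.vv", 16, 64),
    ActiveError("test_0047", "vadd.vv", 256, 32),
    ActiveError("test_0051", "vfmul.vf", 16, 64),
]
by_mnemonic = group_errors(errors, "mnemonic")
by_sew = group_errors(errors, "sew")
```

Grouping by a shared attribute like this lets one debugging session cover several failing tests that may stem from the same root cause.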

Nightly and periodic runs

During the reported verification campaign, the team ran nightly simulations. The paper reports 24 tests per night between April and July, then 50 tests per night from August until the end of November, which marked RTL freeze before chip tape-out. Each test contained approximately 500 vector instructions. [c9]
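
To give a sense of scale, the reported figures imply a rough nightly instruction volume. The short calculation below uses only the numbers quoted above; since the paper gives the per-test instruction count as approximate, the results are estimates, and the function name is a hypothetical helper.

```python
INSTRUCTIONS_PER_TEST = 500  # approximate figure reported in the paper


def nightly_instructions(tests_per_night: int) -> int:
    """Estimate vector instructions simulated in one nightly run."""
    return tests_per_night * INSTRUCTIONS_PER_TEST


print(nightly_instructions(24))  # April to July: 12000
print(nightly_instructions(50))  # August to end of November: 25000
```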

The authors state that the CI infrastructure played an essential role in code health, maintainability, and coverage closure. In the experimental results, they further report that the CI pipelines allowed both RTL design and verification teams to test new features and find new errors. [c10]