Overview
eUVM is presented as a verification environment built on the D programming language and used in an optimized port of RISCV-DV. The cited work contrasts eUVM with SystemVerilog/UVM on testbench performance. It points to two limitations in particular: SystemVerilog offers no user-level language constructs for synchronized access to shared data, and multicore support in simulators is largely aimed at RTL or gate-level simulation rather than behavioral testbench code. [C1]
Role in RISCV-DV optimization
The paper describes an eUVM RISCV-DV port that changes the RISCV-DV architecture to better fit D-language concurrency semantics. In the original SystemVerilog code, some instruction-registry data is statically scoped in riscv_instr.sv. The eUVM port refactors those variables and related functions into a separate riscv_instr_registry class, then instantiates that registry inside the singleton riscv_instr_gen_config class to preserve singleton-like behavior while avoiding problematic global/static shared state in concurrent software. [C2]
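The refactoring pattern described above can be illustrated with a minimal, language-neutral sketch in Python. It is not the eUVM or RISCV-DV code: the class names `InstrRegistry` and `GenConfig` and every field and method on them are hypothetical stand-ins for riscv_instr_registry and riscv_instr_gen_config. It only shows the idea of moving module-level (static) state into an object owned by a singleton.

```python
import threading


class InstrRegistry:
    """Hypothetical registry: holds data that would otherwise be static/global."""

    def __init__(self):
        self.names_by_group = {}

    def register(self, name, group):
        # Record an instruction name under its group.
        self.names_by_group.setdefault(group, []).append(name)

    def names(self, group):
        # Return a copy so callers cannot mutate registry state by accident.
        return list(self.names_by_group.get(group, []))


class GenConfig:
    """Hypothetical singleton configuration that owns the registry instance."""

    _instance = None
    _lock = threading.Lock()

    def __init__(self):
        # The registry is an ordinary member, not a global variable.
        self.registry = InstrRegistry()

    @classmethod
    def get(cls):
        # Lazily create the single instance; the lock guards concurrent first use.
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance


cfg = GenConfig.get()
cfg.registry.register("ADD", "RV32I")
cfg.registry.register("MUL", "RV32M")
```

Because every caller reaches the registry through the one configuration object, the data is still effectively unique per run, but it is no longer free-floating static state.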
Multicore execution model
As in SystemVerilog, the fundamental unit of testbench execution in eUVM is a process. A process can be declared as a task or forked from an existing task using the eUVM fork construct. The cited paper states that eUVM differs from SystemVerilog in two ways: it can execute threads on multiple cores, and it allows a newly forked process to be delegated to a specified processor thread. [C3]
For large RISCV-DV instruction sequences, eUVM uses a parallelized fork strategy. The optimized generator decides whether to parallelize based on instruction count, with a default threshold of 4000 in the par_instr_threshold configuration parameter. When the threshold is exceeded, it splits instruction randomization into par_num_threads slices, with the default thread count given as 8, and randomizes each slice in a separate thread. [C4]
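The threshold check and slicing described above can be sketched as follows. This is a structural illustration in Python, not the eUVM implementation: the function names, the use of `ThreadPoolExecutor`, and the stand-in "randomization" (drawing 32-bit integers) are all assumptions, and only the two default values (4000 and 8) come from the text. Python threads also do not run CPU-bound work in parallel, so the sketch shows the shape of the decision, not the speedup.

```python
import random
from concurrent.futures import ThreadPoolExecutor

PAR_INSTR_THRESHOLD = 4000  # default threshold given in the text
PAR_NUM_THREADS = 8         # default thread count given in the text


def slice_bounds(count, num_slices):
    """Split range(count) into num_slices contiguous (start, stop) pairs."""
    base, extra = divmod(count, num_slices)
    bounds = []
    start = 0
    for i in range(num_slices):
        # The first `extra` slices take one additional element each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def randomize_slice(seed, start, stop):
    """Stand-in for instruction randomization: one random word per slot."""
    rng = random.Random(seed)  # private generator, no shared state
    return [rng.randrange(2**32) for _ in range(start, stop)]


def generate(count, seed=0, threshold=PAR_INSTR_THRESHOLD,
             num_threads=PAR_NUM_THREADS):
    # At or below the threshold, stay on a single thread.
    if count <= threshold:
        return randomize_slice(seed, 0, count)
    # Above the threshold, hand one slice to each worker.
    bounds = slice_bounds(count, num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(randomize_slice, seed + i, lo, hi)
                   for i, (lo, hi) in enumerate(bounds)]
        result = []
        for fut in futures:  # collect in slice order
            result.extend(fut.result())
    return result
```

Giving each slice its own seeded generator keeps the workers independent, which mirrors the motivation for avoiding shared mutable state in the concurrent port.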
The eUVM fork construct returns a Fork object, which can be stored in a list, configured, and joined later. The set_thread_affinity method assigns a fork to a specific execution thread. [C5]
Directed instruction-stream parallelization
For directed instruction streams, the cited implementation uses a different strategy: because there are multiple groups of directed streams, a separate thread is designated for randomizing each group. The listing shows creation of Fork objects, use of set_thread_affinity, joining of all forks, and shuffling of the resulting stream. [C6]
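The per-group strategy can be sketched in the same spirit. The Python below is an assumption-laden illustration, not the paper's listing: `threading.Thread` stands in for the eUVM Fork object, the group names and sizes are invented, and thread affinity is omitted because Python has no portable equivalent of set_thread_affinity.

```python
import random
import threading


def randomize_group(name, size, seed, out):
    """Stand-in for randomizing one group of directed streams."""
    rng = random.Random(seed)
    # Each worker writes only to its own key, so no lock is needed here.
    out[name] = [(name, rng.randrange(2**32)) for _ in range(size)]


def generate_directed(groups, seed=0):
    results = {}
    workers = []
    # One worker per group, kept in a list so all can be joined later.
    for idx, (name, size) in enumerate(groups.items()):
        worker = threading.Thread(target=randomize_group,
                                  args=(name, size, seed + idx, results))
        workers.append(worker)
        worker.start()
    # Wait for every group to finish before touching the results.
    for worker in workers:
        worker.join()
    # Concatenate in a fixed group order, then shuffle the combined stream.
    stream = [item for name in groups for item in results[name]]
    random.Random(seed).shuffle(stream)
    return stream
```

A call such as `generate_directed({"load_store": 3, "loop": 2})` returns five entries drawn from both groups in shuffled order.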
Profiling support
eUVM includes a uvm_trace construct used for macro-level profiling and formal identification of testbench bottlenecks. The paper cautions that each uvm_trace invocation performs an operating-system call to fetch the current clock time, so excessive use can significantly increase runtime. [C7]
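The cost argument can be made concrete with a small Python sketch. It does not use or imitate the uvm_trace API; `trace` and `run_phases` are hypothetical helpers that only show why tracing whole phases is cheaper than tracing every item: the number of clock reads depends on the number of traced regions, not on the amount of work inside them.

```python
import time
from contextlib import contextmanager


@contextmanager
def trace(label, log):
    """Record the wall-clock duration of one region (two clock reads)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log.append((label, time.perf_counter() - start))


def run_phases(n_items):
    log = []
    # Macro-level tracing: one region per phase, however large n_items is.
    with trace("randomize", log):
        items = [i * i for i in range(n_items)]
    with trace("emit", log):
        text = "\n".join(str(x) for x in items)
    return log, len(text)
```

Wrapping each loop iteration in its own `trace` region instead would add two clock reads per item, which is the kind of overhead the paper warns about.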
Runtime optimization techniques
The paper describes several eUVM-oriented implementation techniques for reducing runtime overhead:
- Efficient shallow copy: eUVM implements shallow copy by using D object introspection to determine the memory footprint of an object and then copying the relevant memory slice. The paper notes that this becomes a single memcopy operation and is more efficient than copying individual class elements through UVM utility copy constructs. [C8]
- Reduced memory allocation: eUVM avoids some trivial allocations by using D’s sformat, which lets the user supply scratch memory for formatted output. In the example, a fixed-size character buffer is used for formatting a 32-bit immediate value, reducing calls to malloc by half compared with an allocation-returning string-formatting approach. [C9]
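The shallow-copy technique in the list above can be illustrated by analogy with Python's `ctypes` module, which exposes a fixed memory layout and a block-copy primitive. This is not the eUVM implementation and does not use D introspection: the `Instr` structure and its fields are invented for the example. It contrasts a field-by-field copy with a single block copy of the object's memory footprint.

```python
import ctypes


class Instr(ctypes.Structure):
    """Hypothetical fixed-layout record standing in for a class object."""
    _fields_ = [
        ("opcode", ctypes.c_uint32),
        ("rd", ctypes.c_uint8),
        ("rs1", ctypes.c_uint8),
        ("imm", ctypes.c_int32),
    ]


def copy_fieldwise(src):
    """Copy one member at a time, as a generic utility copy would."""
    dst = type(src)()
    for name, _ctype in src._fields_:
        setattr(dst, name, getattr(src, name))
    return dst


def copy_block(src):
    """Copy the whole memory footprint in a single block operation."""
    dst = type(src)()
    ctypes.memmove(ctypes.addressof(dst), ctypes.addressof(src),
                   ctypes.sizeof(src))
    return dst


original = Instr(opcode=0x13, rd=5, rs1=6, imm=-42)
clone = copy_block(original)
clone.imm = 7  # the clone is independent of the original
```

The block copy does a fixed amount of work regardless of how many fields the record has, whereas the field-wise version grows with the field count.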
Context and motivation
The motivation for eUVM in the cited work is testbench performance. The paper states that SystemVerilog/UVM RISCV-DV execution is limited by complex constraint solving and sub-optimal algorithmic implementation, and that SystemVerilog lacks native data types, requiring DPI-based C/C++ interfacing for emulation-platform integration. It also states that computational algorithms written in SystemVerilog execute about an order of magnitude slower than corresponding C/C++ or other native-language implementations because SystemVerilog integral variables and expressions carry value-change event semantics. [C10]