Lazy Merging of Directed Streams Wiki

Overview

Lazy Merging of Directed Streams is a technique for combining directed instruction streams with an initial randomized instruction dump in RISCV-DV. It was introduced as an algorithmic refactoring of a greedy merge process that did not scale well for large generated instruction counts. The refactored process avoids actual queue insertions during placement and instead delays materialization of the merged instruction list until all directed stream injection locations have been recorded.

Problem Addressed

The original RISCV-DV merge process inserted directed stream sequences greedily into the initial instruction dump. It selected a random injection location, checked whether the location fell inside an already inserted directed sequence, retried random selection up to ten times if necessary, and if a valid location was still not found, advanced to the boundary of the conflicting directed stream and inserted the new stream after it.

This approach required inserting slices of elements into a SystemVerilog queue. Because insertion can require moving elements before or after the insertion point, the cost grows with both the generated instruction count and the number of directed streams. The cited source characterizes the resulting behavior as O(n²) and notes that the generator did not scale well for instruction counts larger than one hundred thousand.

Technique

Lazy merging changes the merge operation from repeated insertion into a queue to a two-phase process:

Mark injection locations: For each injectable directed stream, a random index is generated to identify where the stream should be injected into the initial randomized dump. No insertion is performed at this stage.
Record stream placement metadata: The selected injection locations are enumerated in an array. If more than one directed stream targets the same location, the streams are connected through a linked-list-like structure using an insert_idx / next_stream field embedded in the stream object.
Materialize the merged dump once: After all directed stream locations have been tabulated, the final merged instruction dump is constructed in a single pass. For each instruction position in the original instruction list, any directed streams recorded for that position are emitted first, followed by the original instruction.

Implementation Characteristics

The listing in the source shows a function mixin_dir_instr_list(riscv_instr_stream[] dir_list) that:

computes the original instruction count and final mixed instruction count,
stores directed stream placement entries in dir_n,
uses urandom(0, instr_count) to select an intended injection index,
chains streams that target the same index through next_stream, and
creates mixed_list by walking the original instruction list and copying any directed streams associated with each position before copying the original instruction.

The final step replaces the original instruction list with the constructed merged list using instr_list.assimilate(mixed_list).

Complexity and Scalability

The cited source states that this refactoring reduces the insertion process to O(n) by avoiding queue insertions and creating the merged function body in one pass. After re-profiling, the merge process was reported to scale well and no longer be a bottleneck for generating large test programs.

Context

The technique appears in the context of making RISCV-DV scalable for very large generated test programs. The source also notes that larger test programs can generate increasingly complex program flows, which are more likely to uncover processor bugs.