Skip to content
STIMSMITH

Lazy Merging of Directed Streams

Technique WIKI v1 · 5/26/2026

Lazy Merging of Directed Streams is an algorithmic refactoring used in RISCV-DV to merge directed instruction streams into an initial randomized instruction dump without repeated queue insertions. The technique records intended injection locations first, chains multiple streams targeting the same location, and then builds the final merged instruction dump in one pass, reducing the merge process from O(n²) behavior to O(n) scaling.

Overview

Lazy Merging of Directed Streams is a technique for combining directed instruction streams with an initial randomized instruction dump in RISCV-DV. It was introduced as an algorithmic refactoring of a greedy merge process that did not scale well for large generated instruction counts. The refactored process avoids actual queue insertions during placement and instead delays materialization of the merged instruction list until all directed stream injection locations have been recorded.

Problem Addressed

The original RISCV-DV merge process inserted directed stream sequences greedily into the initial instruction dump. It selected a random injection location, checked whether the location fell inside an already inserted directed sequence, retried random selection up to ten times if necessary, and if a valid location was still not found, advanced to the boundary of the conflicting directed stream and inserted the new stream after it.

This approach required inserting slices of elements into a SystemVerilog queue. Because insertion can require moving elements before or after the insertion point, the cost grows with both the generated instruction count and the number of directed streams. The cited source characterizes the resulting behavior as O(n²) and notes that the generator did not scale well for instruction counts larger than one hundred thousand.

Technique

Lazy merging changes the merge operation from repeated insertion into a queue to a two-phase process:

  1. Mark injection locations: For each injectable directed stream, a random index is generated to identify where the stream should be injected into the initial randomized dump. No insertion is performed at this stage.
  2. Record stream placement metadata: The selected injection locations are enumerated in an array. If more than one directed stream targets the same location, the streams are connected through a linked-list-like structure using an insert_idx / next_stream field embedded in the stream object.
  3. Materialize the merged dump once: After all directed stream locations have been tabulated, the final merged instruction dump is constructed in a single pass. For each instruction position in the original instruction list, any directed streams recorded for that position are emitted first, followed by the original instruction.

Implementation Characteristics

The listing in the source shows a function mixin_dir_instr_list(riscv_instr_stream[] dir_list) that:

  • computes the original instruction count and final mixed instruction count,
  • stores directed stream placement entries in dir_n,
  • uses urandom(0, instr_count) to select an intended injection index,
  • chains streams that target the same index through next_stream, and
  • creates mixed_list by walking the original instruction list and copying any directed streams associated with each position before copying the original instruction.

The final step replaces the original instruction list with the constructed merged list using instr_list.assimilate(mixed_list).

Complexity and Scalability

The cited source states that this refactoring reduces the insertion process to O(n) by avoiding queue insertions and creating the merged function body in one pass. After re-profiling, the merge process was reported to scale well and no longer be a bottleneck for generating large test programs.

Context

The technique appears in the context of making RISCV-DV scalable for very large generated test programs. The source also notes that larger test programs can generate increasingly complex program flows, which are more likely to uncover processor bugs.

CITATIONS

7 sources
7 citations
[1] The original RISCV-DV merge process was greedy: it randomly selected a directed-stream injection location, retried up to ten times if the location was inside another directed sequence, and then advanced to a boundary if needed. [PDF] Crafting a Million Instructions/Sec RISCV-DV - DVCon Proceedings
[2] The greedy process involved inserting slices into a SystemVerilog queue, which required moving elements and led to O(n²) complexity as instruction counts and directed stream counts increased. [PDF] Crafting a Million Instructions/Sec RISCV-DV - DVCon Proceedings
[3] Lazy merging marks intended injection locations with a random index instead of immediately performing insertions. [PDF] Crafting a Million Instructions/Sec RISCV-DV - DVCon Proceedings
[4] When multiple directed streams target the same injection location, the method stores the information in a linked-list structure using an index embedded in the stream object. [PDF] Crafting a Million Instructions/Sec RISCV-DV - DVCon Proceedings
[5] After all injectable directed streams have been tabulated, the merged instruction dump is created in one pass by emitting directed stream instructions for each recorded position and then the original instruction. [PDF] Crafting a Million Instructions/Sec RISCV-DV - DVCon Proceedings
[6] The refactoring reduces the insertion process to O(n), and re-profiling found that the merge process scaled well and was no longer a bottleneck for large test-program generation. [PDF] Crafting a Million Instructions/Sec RISCV-DV - DVCon Proceedings
[7] The source notes that larger test programs can generate increasingly complex program flows that are more likely to uncover processor bugs. [PDF] Crafting a Million Instructions/Sec RISCV-DV - DVCon Proceedings