STIMSMITH

Directed Instruction Stream Generation

Technique WIKI v1 · 5/26/2026

Directed Instruction Stream Generation is a RISCV-DV test-generation technique for execution scenarios that require coordinated constraints across several instructions, such as loops, load/store hazards, and numeric computation exceptions. The instruction sequences for these scenarios are generated and randomized as directed streams, then inserted into a dump of non-directed randomized instructions while each directed sequence is kept intact.

Overview

Directed Instruction Stream Generation is used in RISCV-DV when certain RISC-V instruction categories cannot be meaningfully tested as isolated, unconstrained random instructions. RISCV-DV groups execution use-cases such as loop sequences, load/store hazards, and numeric computation exceptions into directed streams, whose instructions are constrained together as a sequence to create a desired execution scenario. These streams are generated, randomized, and inserted randomly into an initially generated dump of non-directed randomized instructions. During insertion, RISCV-DV avoids inserting one directed stream inside another directed stream already present in the dump. [C1]

Motivation

The technique addresses instruction categories that require sequencing. The cited paper gives jalr as an example: an unconstrained jump instruction could skip most executable test code, create an infinite loop, or jump outside the test program’s scope. Directed streams allow such scenarios to be constrained together so that the generated program remains meaningful for verification. [C1]

Generation flow

A directed stream generation flow in RISCV-DV includes the following steps:

  1. Configure instruction-generation parameters, including the ratio of different instruction-stream kinds to generate. [C2]
  2. Generate and randomize directed instruction streams for the main program and sub-programs according to user-specified ratios. [C3]
  3. Generate an initial dump of non-directed randomized instructions. [C1]
  4. Randomly insert the directed streams into the non-directed stream while preserving directed sequences as atomic regions and avoiding insertion inside an existing directed stream. [C1]
  5. Revisit the merged stream and annotate target locations for embedded jump instructions with appropriate labels. [C1]
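
Steps 2 to 4 can be illustrated with a minimal Python sketch. It is not RISCV-DV code: the function name build_program, the string instruction names, and the slot-based merge are assumptions made for illustration. Each directed stream is stored as one slot, so inserting between slots cannot split a stream. Step 5 (label annotation) is left out.

```python
import random

def build_program(instr_cnt, stream_counts, stream_len=3, seed=0):
    """Toy model of steps 2-4; all names here are hypothetical."""
    rng = random.Random(seed)

    # Step 2: create the directed streams. Each stream is a list of
    # instructions that must stay together as one unit.
    directed = []
    for kind, count in stream_counts.items():
        for n in range(count):
            directed.append([f"{kind}[{n}].{i}" for i in range(stream_len)])
    rng.shuffle(directed)

    # Step 3: initial dump of non-directed instructions, one per slot.
    slots = [[f"rand_{i}"] for i in range(instr_cnt)]

    # Step 4: insert every stream at a random slot boundary. A boundary
    # is never inside a stream, so earlier streams stay intact.
    for stream in directed:
        slots.insert(rng.randint(0, len(slots)), stream)

    # Flatten the slots into the final instruction list.
    program = [instr for slot in slots for instr in slot]
    return program, directed

program, directed = build_program(100, {"loop": 2, "load_store_hazard": 1})
```

The slot representation is a design shortcut for the sketch; the cited paper describes the original merge as choosing random locations and retrying on conflict.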

Performance characteristics

Profiling of RISCV-DV identified directed stream creation and randomization as the highest-impact bottleneck among those the cited paper lists. The cited analysis attributes most of the time in directed-stream generation to constraint-solver execution. [C3]

The insertion of directed streams into the non-directed instruction stream was identified as another major bottleneck. The original insertion process used a greedy approach: it randomly selected an injection location, and if that location fell inside an already inserted directed sequence, it selected another random location. This insertion bottleneck became more severe as instruction count increased and was characterized as having O(n²) algorithmic complexity. [C4]
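
The greedy retry behaviour described above can be sketched in Python as follows. This is an illustrative model, not the RISCV-DV implementation: greedy_insert, the owner list, and the retry counter are hypothetical names, and the sketch does not attempt to reproduce the paper's measurements.

```python
import random

def greedy_insert(base, streams, seed=0):
    """Insert each stream at a random position, retrying on conflict."""
    rng = random.Random(seed)
    program = list(base)
    # owner[i] is the index of the directed stream that instruction i
    # belongs to, or None for a non-directed instruction.
    owner = [None] * len(program)
    retries = 0

    for sid, stream in enumerate(streams):
        while True:
            pos = rng.randint(0, len(program))  # insert before program[pos]
            inside = (
                0 < pos < len(program)
                and owner[pos - 1] is not None
                and owner[pos - 1] == owner[pos]
            )
            if not inside:
                break
            retries += 1  # location falls inside a directed stream: pick again

        # Splice the whole stream in as one contiguous region.
        program[pos:pos] = stream
        owner[pos:pos] = [sid] * len(stream)

    return program, retries

base = [f"r{i}" for i in range(50)]
streams = [[f"s{k}.{i}" for i in range(4)] for k in range(10)]
program, retries = greedy_insert(base, streams, seed=1)
```

In this model the share of positions that lie inside directed regions grows with every insertion, so retries become more likely as more streams are placed.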

Parallelized implementation pattern

The cited optimized implementation parallelizes the randomization of directed instruction streams. In Listing 10 of the paper, generate_directed_instr_stream iterates over directed_instr_stream_ratio and, for each stream, computes an insertion count as original_instr_cnt * ratio / 1000, enforces a minimum insertion count, adds the count to a running total stream length, and spawns a fork that calls generate_directed_instr_stream_idx. Each fork is assigned a thread affinity. After all forks are joined, the expected stream length is asserted and the final directed stream array is shuffled. [C5]
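
The structure of that listing can be approximated in Python as a rough sketch. Only the names generate_directed_instr_stream, directed_instr_stream_ratio (here the ratios argument), and original_instr_cnt come from the text above. The helper generate_stream_idx, the min_insert_cnt default of 1, integer division, and the use of ThreadPoolExecutor in place of forks are assumptions. Thread affinity is not modelled, and the sketch shows the structure only, not the speedup.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def generate_stream_idx(name, count, seed):
    """Hypothetical stand-in for generate_directed_instr_stream_idx."""
    rng = random.Random(seed)
    return [f"{name}#{i}:{rng.randrange(1 << 16):04x}" for i in range(count)]

def generate_directed_instr_stream(ratios, original_instr_cnt,
                                   min_insert_cnt=1, seed=0):
    # Insertion count per stream type: original_instr_cnt * ratio / 1000
    # (integer division assumed), raised to the minimum when below it.
    counts = {
        name: max(min_insert_cnt, original_instr_cnt * ratio // 1000)
        for name, ratio in ratios.items()
    }
    expected_len = sum(counts.values())

    # One task per stream type plays the role of a fork; collecting
    # every result plays the role of the join.
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(generate_stream_idx, name, cnt, seed + i)
            for i, (name, cnt) in enumerate(counts.items())
        ]
        streams = [s for f in futures for s in f.result()]

    # Check the expected length, then shuffle the combined array.
    assert len(streams) == expected_len
    random.Random(seed).shuffle(streams)
    return streams

streams = generate_directed_instr_stream({"loop": 20, "jal": 1}, 500)
```

With these inputs the "jal" ratio yields a raw count of zero, so the minimum insertion count applies and one stream of that type is still generated.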

This parallelization targets a workload whose algorithmic complexity is linear and whose run time is dominated by constraint solving. The paper states that the first two bottlenecks, directed-stream generation and non-directed instruction generation, scale well with multicore parallelization. [C3]

Implementation notes

  • Directed streams represent multi-instruction scenarios that require constraints across the sequence rather than independent randomization. [C1]
  • Inserted directed sequences are treated as intact regions so later insertions do not split them. [C1]
  • The number of directed streams to insert can be derived from user-specified ratios and the original instruction count, with a minimum insertion count applied per stream type in the cited code. [C5]
  • Parallel generation can be organized per stream type, with independent forked calls producing segments of the directed stream array before shuffling. [C5]

CITATIONS

[1] C1: RISCV-DV groups execution use-cases such as loops, load/store hazards, and numeric computation exceptions into directed streams; these streams are constrained as sequences, generated, randomized, and inserted into non-directed randomized instructions while avoiding insertion inside another directed stream. Source: Crafting a Million Instructions/Sec RISCV-DV, DVCon Proceedings.
[2] C2: RISCV-DV is highly customizable and supports command-line options including the ratio of various instruction-stream kinds to generate. Source: Crafting a Million Instructions/Sec RISCV-DV, DVCon Proceedings.
[3] C3: Profiling identified directed instruction stream creation and randomization as the highest-impact bottleneck, with most time spent in constraint solvers; the first two listed bottlenecks are linear and handled with multicore parallelization. Source: Crafting a Million Instructions/Sec RISCV-DV, DVCon Proceedings.
[4] C4: Directed-stream insertion into the non-directed stream was identified as a bottleneck with O(n²) complexity; the original merge process picked random insertion locations and retried when a location violated an existing directed sequence. Source: Crafting a Million Instructions/Sec RISCV-DV, DVCon Proceedings.
[5] C5: Listing 10 shows parallelized directed instruction stream randomization using per-stream insertion counts derived from ratios, forked calls to generate_directed_instr_stream_idx, thread affinity, fork joins, a length assertion, and final shuffling. Source: Crafting a Million Instructions/Sec RISCV-DV, DVCon Proceedings.