On-the-fly instruction stream generation Wiki

Overview

On-the-fly instruction stream generation is a processor-verification technique described as part of a cross-level testing approach. The approach generates an endless instruction stream during simulation and feeds that stream to both an RTL core and an instruction-set simulator (ISS) in a co-simulation testbench. The RTL instruction fetch path always generates a new instruction, even if the same program counter (PC) has already been fetched, while the ISS receives the corresponding instruction that was fetched by the RTL core. [claim: overview_and_cosimulation]

The instruction stream sits between the instruction generator and the memory interfaces. For RTL fetches, the stream calls the instruction generator, stores the generated instruction together with the PC in a pending-instruction queue, and returns the instruction. For ISS fetches, the stream searches for the matching instruction previously fetched by the RTL core. [claim: stream_placement_and_fetch_flow]

Testbench context

In the documented co-simulation testbench, the RTL core is clock-driven and has separate instruction-memory and data-memory interfaces. The memory interfaces translate between RTL signals and TLM transactions, giving the RTL core and ISS a common memory abstraction. The ISS also uses separate instruction and data memory interfaces. [claim: memory_interface_context]

The data memory is implemented lazily: it is initially empty, writes store data, and reads either return existing data or generate random data for previously unseen addresses. The RTL and ISS data memories use the same random seed, so they behave identically when the RTL core and ISS perform the same data-memory access sequence in the same order. [claim: lazy_data_memory]
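The lazy behavior can be illustrated with a minimal Python sketch. The class name `LazyMemory`, the 32-bit word granularity, and the seed value are assumptions made for illustration; the source does not specify them.

```python
import random


class LazyMemory:
    """Sparse data memory that invents random data for unseen addresses.

    Hypothetical sketch: word-granular, 32-bit values (assumed, not from the source).
    """

    def __init__(self, seed: int) -> None:
        # A private RNG keeps two instances with the same seed in lockstep.
        self._rng = random.Random(seed)
        self._cells: dict[int, int] = {}  # initially empty

    def write(self, addr: int, value: int) -> None:
        self._cells[addr] = value & 0xFFFFFFFF

    def read(self, addr: int) -> int:
        if addr not in self._cells:
            # First read of an unseen address: generate and remember random data.
            self._cells[addr] = self._rng.getrandbits(32)
        return self._cells[addr]


def replay(mem: LazyMemory, accesses) -> list[int]:
    """Apply an access sequence and collect the values returned by reads."""
    seen = []
    for op, addr, *value in accesses:
        if op == "w":
            mem.write(addr, value[0])
        else:
            seen.append(mem.read(addr))
    return seen


accesses = [("r", 0x100), ("w", 0x104, 7), ("r", 0x104), ("r", 0x200), ("r", 0x100)]
rtl_seen = replay(LazyMemory(seed=1234), accesses)
iss_seen = replay(LazyMemory(seed=1234), accesses)
```

Because each instance draws from its own identically seeded generator, the two memories return the same values only as long as previously unseen addresses are first read in the same order. That is why the access sequence must match on both sides.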

Supported instruction behavior

The approach is described as generating an endless instruction stream "without restrictions on the generated instructions." In the cited implementation, unrestricted generation covers three cases: all memory-access instructions, because the complete address range of the data-memory interface is wrapped; jump instructions, including self-loops; and special RISC-V CSR access instructions. This breadth is presented as enabling comprehensive testing. [claim: unrestricted_generation]

Independent of the generated instruction sequence, the expected verification condition is that the ISS and RTL core behave identically on observable architectural state, such as register updates. [claim: architectural_state_comparison]
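As a rough illustration of this check, the sketch below compares two recorded sequences of register updates and reports the first position at which they differ. The `(register index, value)` tuple representation and the function name `first_divergence` are assumptions for this example, not part of the described testbench.

```python
from itertools import zip_longest
from typing import Iterable, Optional, Tuple

RegUpdate = Tuple[int, int]  # (register index, written value); assumed representation


def first_divergence(
    rtl_updates: Iterable[RegUpdate], iss_updates: Iterable[RegUpdate]
) -> Optional[int]:
    """Return the index of the first differing register update, or None if equal.

    A missing update on either side (one trace shorter) also counts as a divergence.
    """
    for index, (rtl, iss) in enumerate(zip_longest(rtl_updates, iss_updates)):
        if rtl != iss:
            return index
    return None


matching = first_divergence([(1, 5), (2, 7)], [(1, 5), (2, 7)])
wrong_value = first_divergence([(1, 5), (2, 7)], [(1, 5), (2, 8)])
missing_update = first_divergence([(1, 5)], [(1, 5), (2, 7)])
```

This works on complete traces. A live co-simulation would compare updates incrementally as each RTL instruction completes.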

Instruction matching problem

Feeding the same dynamically generated stream to a pipelined RTL core and an ISS requires special handling. One challenge is detecting completed instructions in the RTL core: the industrial pipelined RTL core discussed in the source does not provide a single signal that directly identifies completed instructions. Several effects complicate detection. Illegal instructions may bypass pipeline stages and may not trigger normal register-writeback notifications. Legal instructions ahead of an illegal instruction must still complete first. Flushes, traps, jumps, and multi-cycle operations can introduce delays and gaps. [claim: completion_detection_challenge]

A second challenge is matching instruction fetches between the RTL core and ISS. Direct matching by PC is insufficient because the RTL core can prefetch or fetch again before the ISS has reached the same point. The source gives the example of a one-instruction backward jump from address 8 to address 4: the RTL core can execute the jump and begin prefetching from the target before the jump has fully completed, so a new instruction is generated for address 4 before the ISS has had the opportunity to execute the jump. [claim: pc_matching_problem]
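The backward-jump example can be reproduced with a deliberately naive sketch that keeps only the most recent instruction per PC. The placeholder strings stand in for real instruction encodings, and the `by_pc` dictionary is a hypothetical strawman, not the described design.

```python
import itertools


def instruction_generator():
    """Stand-in generator: yields distinct placeholder instructions forever."""
    for n in itertools.count():
        yield f"instr_{n}"


gen = instruction_generator()
by_pc: dict[int, str] = {}  # naive store: one instruction per PC, last write wins


def rtl_fetch(pc: int) -> str:
    # Every RTL fetch generates a fresh instruction, even for a PC seen before.
    by_pc[pc] = next(gen)
    return by_pc[pc]


first_at_4 = rtl_fetch(4)
jump_at_8 = rtl_fetch(8)    # imagine this instruction jumps back to address 4
second_at_4 = rtl_fetch(4)  # RTL prefetches the jump target while the ISS lags behind

# The ISS now performs its first fetch at PC 4 and should see first_at_4,
# but a PC-only lookup hands it the instruction from the later refetch.
iss_sees = by_pc[4]
```

Here `iss_sees` equals `second_at_4`, so the ISS would execute a different instruction at address 4 than the RTL core did the first time. The PC alone does not identify which fetch is meant.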

Pending-instruction queue solution

The described solution keeps a queue of pending instructions in fetch order. Each RTL fetch generates an instruction and stores the pair (PC, instruction) in the queue. For the ISS, the instruction stream is queried with the ISS PC and the last completed RTL instruction. The stream searches the pending queue for an entry whose PC matches the ISS PC and whose instruction matches the completed RTL instruction; if such an entry is found, that instruction is returned, and otherwise a mismatch is reported. [claim: pending_queue_solution]
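The queue-based matching can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited implementation: the names `InstructionStream`, `rtl_fetch`, `iss_fetch`, and `FetchMismatch` are hypothetical, instructions are plain integers, and discarding the matched entry together with all older entries is an assumed cleanup policy that the source does not describe.

```python
from collections import deque
from itertools import count
from typing import Callable, Deque, Tuple


class FetchMismatch(Exception):
    """Raised when no pending entry matches the ISS fetch (hypothetical name)."""


class InstructionStream:
    def __init__(self, generate: Callable[[], int]) -> None:
        self._generate = generate
        self._pending: Deque[Tuple[int, int]] = deque()  # (pc, instr) in fetch order

    def rtl_fetch(self, pc: int) -> int:
        """Generate a new instruction for every RTL fetch and remember it."""
        instr = self._generate()
        self._pending.append((pc, instr))
        return instr

    def iss_fetch(self, pc: int, completed_instr: int) -> int:
        """Return the pending instruction matching the ISS PC and the last
        completed RTL instruction, or report a mismatch."""
        match = next(
            (i for i, entry in enumerate(self._pending)
             if entry == (pc, completed_instr)),
            None,
        )
        if match is None:
            raise FetchMismatch(f"no pending entry for pc={pc:#x}")
        # Assumption: entries older than the match were fetched by the RTL core
        # but never completed (for example after a flush), so they are dropped
        # along with the matched entry.
        for _ in range(match + 1):
            self._pending.popleft()
        return completed_instr


words = count(0x100)  # stand-in for a random instruction generator
stream = InstructionStream(lambda: next(words))

first_at_4 = stream.rtl_fetch(4)
jump_at_8 = stream.rtl_fetch(8)    # imagine a jump back to address 4
second_at_4 = stream.rtl_fetch(4)  # prefetch of the jump target

# The ISS follows later, one completed RTL instruction at a time.
iss_order = [
    stream.iss_fetch(4, first_at_4),
    stream.iss_fetch(8, jump_at_8),
    stream.iss_fetch(4, second_at_4),
]
```

Because each lookup is keyed on both the PC and the completed instruction, the two fetches at address 4 stay distinguishable, which a PC-only lookup cannot guarantee.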

This matching mechanism allows the generated-on-demand instruction stream to be shared consistently between a pipelined RTL implementation and the ISS even when prefetching, jumps, traps, or pipeline effects make simple PC-based matching unreliable. [claim: matching_purpose]