Skip to content
STIMSMITH

Execution Trace

Concept WIKI v1 · 6/10/2026

An execution trace is a structured, time-ordered record of the state observations and state changes produced as a system executes a sequence of operations. In RISC-V processor verification, an execution trace conforming to the RVFI format captures, for each executed instruction, the observed architectural state and the resulting state changes, enabling comparison against a reference (golden) model. More broadly, execution traces underpin program debugging, automated program repair, and the theoretical definition of algorithms as sets of behavioral histories.

Overview

An execution trace is a structured, time-ordered record of the state observations and state changes that occur as a system executes a sequence of operations. Execution traces are produced by instrumented hardware designs, ISA simulators, formal models, or instrumented software, and they are consumed by verification engines, debuggers, and program-repair systems in order to detect divergence between actual and expected behavior.

Execution Traces in RISC-V Processor Verification

The RVFI Execution Trace

In the RISC-V verification ecosystem, an execution trace is the per-instruction state record that an implementation delivers in response to a stream of injected instructions. As described in the TestRIG documentation:

"Implementations consume a DII instruction trace, and generate an RVFI execution trace." "The model and implementation return an RVFI-DII execution trace (etrace) that details the state observation and state change of each instruction."

The execution trace therefore captures, for each injected instruction, the architectural state read and the architectural state change resulting from that instruction, and is the artifact by which a verification engine detects disagreement between two implementations or between an implementation and a reference model.

TestRIG and the Verification Engine (Vengine)

In the TestRIG framework, vengines (verification engines) generate one or more DII (Direct Instruction Injection) streams of instruction traces, and consume one or more RVFI streams of execution traces. The vengine feeds identical instruction traces to both a reference model and an implementation under test over TCP sockets in the RVFI-DII format; each side returns an execution trace. The vengine compares the two execution traces and reports any divergence. A capable vengine additionally attempts to reduce the failing trace to a minimal counterexample.

Implementations that wish to be verified by TestRIG must support the RVFI-DII itrace format as an instruction source and the RVFI-DII etrace as the reporting format. In this mode, instructions are consumed exclusively from the RVFI-DII instruction port — bypassing any actual instruction memory and ignoring the architectural program counter — and a trace report is delivered at the end of execution detailing the implementation's behavior in response to that instruction. The RVFI-DII communication uses a single socket, with the itrace consumed and the etrace delivered over the same socket.

Generic Simulation-Based Trace Generation

A separate body of work (Gomes et al., "Large-Scale RISC-V Processor Verification Using Automated Design Inspection and a Generic Simulation Method") constructs execution traces generically by monitoring two processor interfaces:

  1. The instruction memory interface — capturing each fetch as a (program counter, instruction) pair.
  2. The register file write port — capturing each commit as a (target register, written value) triple whenever the write-enable signal is asserted.

The two streams of trace fragments are merged into a single execution trace by aligning them in order: the first register-writing instruction (e.g., an arithmetic instruction) is associated with the first register-file commit, the second with the second, and so on. Because branches do not write the register file, they are skipped during alignment. This strategy makes trace generation processor-agnostic — it does not depend on microarchitectural details of the pipeline or state machine. Memory write operations are deliberately excluded from the execution trace, because accurately monitoring byte and halfword store operations across heterogeneous memory interfaces (write-strobes vs. read-modify-write schemes) is difficult; instead, the test benchmark is constructed so that store errors are caught indirectly by a subsequent load from the same address.

Speculative Fetches and Trace Filtering

Pipelined processors speculatively fetch instructions following branches under a static "not-taken" branch prediction. To handle this, the control-flow fragment of the execution trace is filtered: any fetched instruction whose program counter does not follow the reference flow is discarded as a speculative (canceled) fetch. A subsequent merge step verifies whether the assumed-canceled instructions were truly canceled. An unexpected register-file commit from such an instruction is then associated with the wrong instruction, causing all subsequent associations to be misaligned, which is detected as a mismatch when the trace is compared against the reference model.

Tandem Verification via TestRIG

In the TestRIG ecosystem (Gray et al., "Randomized Testing of RISC-V CPUs using Direct Instruction Injection," IEEE Design & Test, 2023), the verification engine (VEngine) injects instruction sequences over sockets in RVFI-DII format and consumes the resulting RVFI execution traces. RVFI tracing has also been used with formal-verification tools (e.g., Cadence JasperGold) to attempt proof that a series of traces from a simple HDL model is equivalent to a series of traces from a pipelined HDL implementation, although such tools are currently limited to in-order pipelines. TestRIG instead uses directed-random test-sequence generation to compare execution traces between a golden model and a processor in development in order to refute, rather than prove, equivalence, and to produce automatically reduced counterexamples.

Execution Traces in Program Repair and Debugging

Execution traces are also used at the software level to localize divergence between observed and intended program behavior. In the TraceFixer approach (arXiv:2304.12743, 2023):

  • A developer identifies a divergence point in a partial execution trace — the place where actual program behavior first diverges from the desired behavior (e.g., a variable being assigned an incorrect value that then propagates through subsequent computation).
  • The trace is obtained automatically through code instrumentation; the desired correct state that the program should reach at the divergence point is supplied by the user (e.g., via an interactive debugger).
  • A neural program-repair model is trained to predict source-code edits that, given a partial execution trace and a target correct state, eliminate the divergence. Exploiting execution traces improved top-10 bug-fixing ability by 13% to 20% over a baseline that learned from source-code edits alone, and the system successfully fixed 10 of 20 real-world Python bugs in evaluation.

Theoretical Foundations

In theoretical computer science, the notion of an algorithm can be captured as a set of execution traces. This view can be reformulated in the general framework of small-step operational semantics and is, in part, independent of specific abstract-state-machine formalisms (Bezem et al., "Execution traces and reduction sequences," arXiv:2304.05039, 2023).

Relationship to the RVFI Stream

In RISC-V verification, individual execution traces are aggregated into RVFI streams. The TestRIG documentation states that vengines "consume one or more RVFI streams of execution traces," indicating that a single execution trace (the per-instruction state record returned by one implementation in response to one injected instruction stream) is a constituent element of an RVFI stream, with the stream carrying the ordered sequence of per-instruction state records produced by an implementation run.

Evidence Sources

  • TestRIG documentation (CTSRD-CHERI/TestRIG GitHub repository)
  • Gray, Gomes, et al., "Randomized Testing of RISC-V CPUs using Direct Instruction Injection" (IEEE Design & Test, 2023)
  • Gomes et al., "Large-Scale RISC-V Processor Verification Using Automated Design Inspection and a Generic Simulation Method" (SSCAD proceedings)
  • Gao et al., "TraceFixer: Execution Trace-Driven Program Repair" (arXiv:2304.12743, 2023)
  • Bezem et al., "Execution traces and reduction sequences" (arXiv:2304.05039, 2023)

LINKED ENTITIES

1 links

CITATIONS

8 sources
8 citations
[1] Implementations consume a DII instruction trace and generate an RVFI execution trace that details the state observation and state change of each instruction. CTSRD-CHERI/TestRIG
[2] Vengines generate one or more DII streams of instruction traces and consume one or more RVFI streams of execution traces. CTSRD-CHERI/TestRIG
[3] In RVFI-DII the itrace is consumed and the etrace is delivered over the same socket, with implementations bypassing any actual instruction memory and the architectural program counter when in DII mode. CTSRD-CHERI/TestRIG
[4] Generic execution-trace generation combines instruction-fetch fragments and register-file-commit fragments by aligning them in order, since both sequences maintain the same order, and excludes memory write operations. Large-Scale RISC-V Processor Verification Using Automated Design Inspection and a Generic Simulation Method
[5] Speculatively fetched instructions following branches are filtered from the execution trace under a static 'not-taken' prediction, and unexpected commits from canceled instructions are detected as a misaligned trace during comparison. Large-Scale RISC-V Processor Verification Using Automated Design Inspection and a Generic Simulation Method
[6] TestRIG's VEngine injects instruction sequences in RVFI-DII format and consumes RVFI execution traces, and a capable vengine reduces failing traces to minimal counterexamples. Randomized Testing of RISC-V CPUs using Direct Instruction Injection
[7] TraceFixer trains a neural program-repair model on partial execution traces and target correct states to predict source-code edits that eliminate divergence, with top-10 bug-fixing ability improved by 13–20% over an edits-only baseline. TraceFixer: Execution Trace-Driven Program Repair
[8] The notion of an algorithm as a set of execution traces can be reformulated in the general framework of small-step operational semantics, independently of the abstract-state-machine formalism. Execution traces and reduction sequences