STIMSMITH

Instruction Fetch Unit

Concept WIKI v1 · 5/28/2026

An Instruction Fetch Unit (IF unit) is the processor front-end subsystem that fetches instructions from the instruction cache and predicts the next program counter. In the cited two-way RISC-V superscalar out-of-order processor, the IF unit fetches two instructions per cycle and includes a dynamic branch-prediction structure built from a Branch History Table (BHT), a Branch Target Buffer (BTB), and a Return Address Stack (RAS).

Overview

The Instruction Fetch Unit (IF unit) is a front-end processor subsystem responsible for fetching instructions from the instruction cache and predicting the next Program Counter (PC) address, i.e., the address of the next instruction to fetch. In the cited two-way RISC-V superscalar out-of-order processor, the IF subsystem is part of the processor front end, comprises the instruction cache and a dynamic predictor module, and fetches two instructions per cycle for forwarding to the Instruction Decode stage.

Role in a superscalar out-of-order pipeline

In the referenced two-way RISC-V superscalar out-of-order design, the front end fetches and decodes instructions before sending them to the back end for execution and retirement. The IF stage supplies up to two fetched instructions per cycle to the Instruction Decode stage, which can also decode two instructions per cycle. This makes the IF unit a throughput-critical front-end component: it must provide instruction bytes and next-PC predictions quickly enough to keep the downstream decode, rename, issue, and execution stages supplied.

Branch-prediction structures

The IF unit’s dynamic predictor includes three main structures:

  • Branch History Table (BHT): maintains history for previous occurrences of branches and predicts branch direction, i.e., taken or not taken. The cited design uses a GShare indexing scheme.
  • Branch Target Buffer (BTB): records target PC addresses for branch instructions, accelerating the determination of branch-taken addresses.
  • Return Address Stack (RAS): stores return addresses for decoded function calls; when a function-return instruction is encountered, the popped RAS entry is used as the next predicted PC address.

Together, these structures allow the IF unit to select likely next fetch addresses before branches are fully resolved later in the pipeline.
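How the three structures cooperate to pick a next fetch address can be sketched in a few lines of Python. This is a minimal behavioral model, not the cited design: the class name FetchPredictor, the 2-bit saturating counters, the table sizes, the fixed 4-byte instruction size, the dictionary-based BTB, and the drop-oldest RAS overflow policy are all assumptions made for illustration. Only the GShare idea (indexing the BHT with the PC XORed with global branch history) comes from the text above.

```python
class FetchPredictor:
    """Toy next-PC predictor: GShare-indexed BHT, dictionary BTB, bounded RAS."""

    def __init__(self, index_bits=10, ras_depth=8):
        self.mask = (1 << index_bits) - 1
        self.ghr = 0                          # global branch history register
        self.bht = [1] * (1 << index_bits)    # assumed 2-bit counters, weakly not-taken
        self.btb = {}                         # branch PC -> last taken target
        self.ras = []                         # return addresses, newest last
        self.ras_depth = ras_depth

    def _index(self, pc):
        # GShare: XOR the PC (minus byte-offset bits) with the global history.
        return ((pc >> 2) ^ self.ghr) & self.mask

    def predict(self, pc, is_return=False):
        """Return the predicted next fetch address for the instruction at pc."""
        if is_return and self.ras:
            return self.ras.pop()             # RAS supplies the return target
        taken = self.bht[self._index(pc)] >= 2
        if taken and pc in self.btb:
            return self.btb[pc]               # predicted taken, target known
        return pc + 4                         # assumed 4-byte fall-through

    def push_call(self, return_pc):
        """Record a return address when a function call is decoded."""
        if len(self.ras) == self.ras_depth:
            self.ras.pop(0)                   # assumed policy: drop oldest entry
        self.ras.append(return_pc)

    def update(self, pc, taken, target):
        """Train on a resolved branch outcome from the execution stage."""
        i = self._index(pc)
        if taken:
            self.bht[i] = min(3, self.bht[i] + 1)
            self.btb[pc] = target
        else:
            self.bht[i] = max(0, self.bht[i] - 1)
        self.ghr = ((self.ghr << 1) | int(taken)) & self.mask


# Usage: train a backward loop branch, then ask for a prediction.
predictor = FetchPredictor(index_bits=4)
for _ in range(8):
    predictor.update(0x100, taken=True, target=0x80)
next_pc = predictor.predict(0x100)
```

One simplification worth noting: update() recomputes the BHT index from the current history, whereas a pipelined implementation would normally carry the prediction-time index along with the branch so that training hits the same entry that produced the prediction.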

Interfaces used in verification

The cited verification work models the IF unit as connected to the rest of the processor through four separate interfaces. During simulation, each interface is driven by a distinct test sequence intended to mimic the activity the IF unit would see when connected to the remaining processor subsystems while executing real programs.

The four documented interfaces are:

  • Predictor Update interface: carries the resolved status of branches from the execution stage to the predictor; its test parameters include backward-branch-taken and forward-branch-taken probabilities.
  • Decode interface: connects the IF unit to the Decode stage, informs the IF unit about validity and instruction type, including function calls and returns, and triggers branch-prediction restart events.
  • Pipeline Flush interface: issues a flush when a branch is mispredicted; its parameters include branch-misprediction and branch-instruction rates.
  • Instruction Cache interface: fetches two instructions from the instruction cache for the current PC address; its parameters include partial-access and miss rates.
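The per-interface parameters listed above can be pictured as a small configuration record that a stimulus generator consults. The Python sketch below is illustrative only: the cited work is UVM-based, so real sequences would be written in SystemVerilog, and the names VirtualSequenceParams and predictor_update_item, the default values, and the branch offsets are hypothetical. No Decode-interface parameters are included because the text above does not name any.

```python
import random
from dataclasses import dataclass


@dataclass
class VirtualSequenceParams:
    """Hypothetical knobs, one group per interface; defaults are placeholders."""
    # Predictor Update interface
    backward_taken_prob: float = 0.9
    forward_taken_prob: float = 0.4
    # Pipeline Flush interface
    mispredict_rate: float = 0.05
    branch_rate: float = 0.2
    # Instruction Cache interface
    partial_access_rate: float = 0.1
    miss_rate: float = 0.02


def predictor_update_item(rng, params, pc):
    """Build one randomized Predictor Update transaction for a branch at pc."""
    offset = rng.choice([-64, -16, 16, 64])   # assumed set of branch offsets
    is_backward = offset < 0
    taken_prob = (params.backward_taken_prob if is_backward
                  else params.forward_taken_prob)
    return {"pc": pc, "target": pc + offset, "taken": rng.random() < taken_prob}


# Usage: a seeded generator keeps the stimulus reproducible across runs.
rng = random.Random(1)
items = [predictor_update_item(rng, VirtualSequenceParams(), pc=0x1000)
         for _ in range(4)]
```

Keeping separate taken probabilities for backward and forward branches lets a sequence imitate loop-heavy code, where backward branches are usually taken, without forcing the same bias onto forward branches.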

Verification concerns

The IF unit has several verification targets because it combines instruction-cache access, branch prediction, restart behavior, and interaction with downstream pipeline backpressure. The cited UVM-based verification work defines coverpoints for conditions such as:

  • both fetched instructions predicted as branch-taken or branch-not-taken;
  • mixed branch predictions across two fetched instructions;
  • writes, reads, overflows, and underflows in BHT structures;
  • reads and writes across BTB entries;
  • BTB full and empty states;
  • RAS full, empty, overflow, and underflow states;
  • restart-event and half-access finite-state-machine transitions.
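To make a few of these coverpoints concrete, the following Python sketch counts hits for the two-slot prediction combinations and the RAS boundary states. It is a behavioral stand-in under stated assumptions: in a UVM environment these would be SystemVerilog covergroups, and the class name FetchCoverage, the bin names, and the sampling hooks are hypothetical rather than taken from the cited work.

```python
from collections import Counter

# Hypothetical bin names covering a subset of the coverpoints listed above.
ALL_BINS = ("both_taken", "both_not_taken", "mixed",
            "ras_empty", "ras_full", "ras_overflow", "ras_underflow")


class FetchCoverage:
    """Counts how often each bin is hit during simulation."""

    def __init__(self, ras_depth):
        self.ras_depth = ras_depth
        self.hits = Counter()

    def sample_prediction(self, taken0, taken1):
        """Sample the predicted directions of the two fetched instructions."""
        if taken0 and taken1:
            self.hits["both_taken"] += 1
        elif not taken0 and not taken1:
            self.hits["both_not_taken"] += 1
        else:
            self.hits["mixed"] += 1

    def sample_ras(self, occupancy, push=False, pop=False):
        """Sample RAS state; occupancy is the entry count before the push/pop."""
        if occupancy == 0:
            self.hits["ras_empty"] += 1
            if pop:
                self.hits["ras_underflow"] += 1   # pop from an empty stack
        if occupancy == self.ras_depth:
            self.hits["ras_full"] += 1
            if push:
                self.hits["ras_overflow"] += 1    # push onto a full stack

    def percent(self):
        """Percentage of bins hit at least once."""
        covered = sum(1 for name in ALL_BINS if self.hits[name] > 0)
        return 100.0 * covered / len(ALL_BINS)


# Usage: sample a few events and read back the coverage figure.
cov = FetchCoverage(ras_depth=2)
cov.sample_prediction(True, True)
cov.sample_prediction(True, False)
cov.sample_ras(occupancy=0, pop=True)
coverage_percent = cov.percent()
```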

The same work applies a multi-armed-bandit-driven verification flow to the IF unit. In that flow, virtual test sequences combine one sequence per IF-unit interface, and the UCB1 algorithm is used to balance exploration and exploitation when selecting sequences for simulation.
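The selection step can be sketched with the standard UCB1 rule: play every arm once, then pick the arm that maximizes its mean reward plus an exploration bonus of sqrt(2 ln t / n), where t is the total number of plays so far and n is the number of plays of that arm. In the Python sketch below each arm stands for one virtual test sequence. The reward definition is an assumption (for example, a coverage gain normalized to the range 0 to 1), since the text above does not say how the cited flow scores a simulation, and the function names are hypothetical.

```python
import math


def ucb1_select(counts, totals):
    """Pick the next arm: untried arms first, then the highest UCB1 score."""
    for arm, plays in enumerate(counts):
        if plays == 0:
            return arm                      # every arm is tried once first
    total_plays = sum(counts)

    def score(arm):
        mean = totals[arm] / counts[arm]    # exploitation term
        bonus = math.sqrt(2.0 * math.log(total_plays) / counts[arm])
        return mean + bonus                 # exploration bonus shrinks with plays

    return max(range(len(counts)), key=score)


def run_bandit(reward_fn, num_arms, rounds):
    """Repeatedly select a sequence, simulate it, and record its reward."""
    counts = [0] * num_arms
    totals = [0.0] * num_arms
    for _ in range(rounds):
        arm = ucb1_select(counts, totals)
        totals[arm] += reward_fn(arm)       # assumed reward in [0, 1]
        counts[arm] += 1
    return counts


# Usage: three hypothetical virtual sequences with fixed average rewards.
assumed_rewards = [0.1, 0.9, 0.2]
plays_per_sequence = run_bandit(lambda arm: assumed_rewards[arm],
                                num_arms=3, rounds=300)
```

Because the bonus term grows slowly with the total number of plays and shrinks with each arm's own plays, rarely chosen sequences are still revisited occasionally, while most simulation time goes to the sequences that have paid off so far.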

CITATIONS

[1] In the cited two-way RISC-V superscalar out-of-order processor, the IF subsystem is part of the front end, comprises the instruction cache and a dynamic predictor module, and fetches two instructions per cycle for the Instruction Decode stage. [PDF] UVM-based verification of RISC-V superscalar processors
[2] The IF unit is responsible for fetching instructions from the instruction cache and predicting the next PC address. [PDF] UVM-based verification of RISC-V superscalar processors
[3] The IF unit's dynamic predictor includes a BHT, BTB, and RAS; the BHT predicts branch direction, the BTB records branch target PC addresses, and the RAS supplies predicted return PCs for function returns. [PDF] UVM-based verification of RISC-V superscalar processors
[4] During simulation, the IF unit is modeled with four separate interfaces, each driven by a distinct test sequence that mimics behavior when connected to the rest of the processor. [PDF] UVM-based verification of RISC-V superscalar processors
[5] The Predictor Update interface receives resolved branch status from the execution stage, and the Decode interface provides validity and instruction-type information and triggers branch-prediction restart events. [PDF] UVM-based verification of RISC-V superscalar processors
[6] The Pipeline Flush interface issues a flush on branch misprediction, and the Instruction Cache interface fetches two instructions from the instruction cache for a current PC address. [PDF] UVM-based verification of RISC-V superscalar processors
[7] Representative IF-unit coverpoints include branch-prediction combinations, BHT reads and writes, BTB reads and writes and full/empty states, RAS full/empty/overflow/underflow states, and restart-event and half-access FSM transitions. [PDF] UVM-based verification of RISC-V superscalar processors
[8] The cited verification flow applies a multi-armed-bandit approach to IF-unit verification and uses UCB1 to balance exploration and exploitation when selecting test sequences. [PDF] UVM-based verification of RISC-V superscalar processors