Pipelined Processor

A pipelined processor is a processor implementation in which the behavior of implementation registers can depend on several instructions that are currently being processed by the pipeline. In the cited verification work, this is presented as a reason why the implementation state of a pipelined design is more complex than the programmer-visible architectural state.^[1]

Architectural state and mapping functions

For processor verification, an operation naturally corresponds to the execution of a single instruction. Each operation property describes how the internal state and output signals change when the processor executes that instruction. The design state is described in terms of a high-level or architectural state corresponding to the programmer's view of the visible registers.^[2]

In a pipelined processor, the relationship between architectural state and implementation registers is not direct. The evidence gives the example of a register file: implementation-register behavior may depend on multiple in-flight instructions, so a mapping function links the architectural register-file state to the implementation and captures the pipeline's forwarding logic. The same source notes that forwarding logic can be hidden in a mapping function, such as vreg, representing the architectural register file.^[1]

Role in instruction-set simulators

Instruction set simulators (ISSs) are used for early software development and testing before a processor is manufactured. Gate-level simulation can be cycle-accurate, but its performance is typically insufficient for in-depth software testing; more abstract ISA-based simulators can achieve several million instructions per second, but they risk diverging from the actual design if the ISA is reimplemented separately.^[3]

A cited ISS-generation approach derives a simulator from a complete property suite used for formal verification. In that approach, an architecture description mainly consists of an architectural state and a next_state function describing the effects of instructions and interrupts on that state. The authors reformulate operation properties into an architectural style to make ISS generation straightforward.^[4]

Pipeline details can affect simulator generation and performance. The ISS-generation paper notes that properties used for generation reflect hardware and pipeline effects that may not be present in a high-level ISA description, and that these effects can decrease simulation performance.^[5]

Example evaluated processors

The ISS-generation study evaluated a small pipelined processor with eight 16-bit registers, a special register for an interrupt return vector, a 5-stage pipeline, a simple data-memory interface, and an instruction set of seven instructions covering logic, arithmetic, memory access, and jumps. For this processor, the generated ISS reached 7 MIPS, compared with 0.22 MIPS for an interpretive simulator and 14 MIPS for a just-in-time compiled simulator from a commercial tool.^[6]

The same study also evaluated an industrial processor design with 64 32-bit registers in multiple hardware contexts, advanced processor features, a 7-stage pipeline, data-memory and bus interfaces, and 88 instructions based on the DLX instruction set architecture. Its processor core contained about 10,000 lines of VHDL, while the reformulated property suite contained about 2,000 lines of ITL. The generated ISS achieved 1.2 MIPS, compared with 2.5 MIPS for a just-in-time compiled simulator built using a commercial tool suite.^[7]

Modeling and synthesis approaches

A public arXiv summary describes Reduced Colored Petri Nets (RCPN) as a model intended to provide a simple and intuitive way to model pipelined processors and generate high-performance cycle-accurate simulators. The same summary states that RCPN processor models mirror processor pipeline block diagrams and reports about a 15× speedup over the SimpleScalar ARM simulator for generated cycle-accurate simulators for XScale and StrongArm models.^[8]

Another public arXiv summary frames pipelined-processor controller specification using the Burch-Dill paradigm: the visible behavior of a pipelined system should be identical to a non-pipelined reference system. That work reports synthesizing a controller for a simple two-stage pipelined processor using interpolation from an unsatisfiable SMT formula.^[9]

Key implications

Pipelining complicates the relation between implementation registers and architectural state because several instructions may be active in the pipeline at once.^[1]
Architectural-state abstractions and mapping functions can hide pipeline details such as forwarding logic while preserving an instruction-level view.^[1]
ISS generation for pipelined processors may need to account for hardware and pipeline effects that are absent from high-level ISA descriptions.^[5]
Formal and model-based techniques in the cited sources use architectural equivalence, property suites, or pipeline-structured models to reason about pipelined processors.^[4]^[8]^[9]

^[1]: See citation "Pipeline implementation state and forwarding logic". ^[2]: See citation "Operation properties and architectural state". ^[3]: See citation "ISS motivation and design divergence risk". ^[4]: See citation "Architectural-style ISS generation". ^[5]: See citation "Pipeline effects in generated ISS performance". ^[6]: See citation "Small 5-stage pipelined processor experiment". ^[7]: See citation "Industrial 7-stage pipelined processor experiment". ^[8]: See citation "RCPN modeling of pipelined processors". ^[9]: See citation "Controller synthesis for a two-stage pipelined processor".