SOURCE ARCHIVE
EXTRACTED CONTENT
61,009 charsAutomated Formal Verification of Processors
Based on Architectural Models
Ulrich K¨uhne∗ Sven Beyer† J¨org Bormann† John Barstow LSV ENS de Cachan OneSpin Solutions GmbH Abstract RT Solutions GmbH Infineon Technologies Ltd Cachan, France Munich, Germany Munich, Germany Bristol, UK
Abstract— processor pipeline, or an arbitration cycle in an arbiter. IPC
To keep up with the growing complexity of digital systems, high has been extended with further proofs which ensure that a set
level models are used in the design process. In today’s processor of properties verifies all input/output behavior of a circuit [5].
design, a comprehensive tool chain can be built automatically This methodology has already been used in industrial context
from architectural or transaction level models, but disregarding
formal verification. We present an approach to automatically for the verification of a wide variety of designs [12] including
generate a complete property suite from an architecture de- small or medium size processors [6].
scription, that can be used to formally verify a register transfer However, these projects also illustrate that the integration
level (RTL) implementation of a processor. The property suite of a thorough formal examination into industrial verification
is complete by construction, i.e. an exhaustive verification of all practice requires larger changes to the education and opin-
the functionality of the processor is ensured by the method. It
allows for the efficient verification of single pipeline processors, ions of verification engineers. Compared to simulation based
including several advanced processor features like multicycle approaches, formal verification requires a deep knowledge
instructions. At the same time, the structured approach reduces of the internals of the design under verification (DUV) in
the effort for verification significantly compared to a manual order to write assertions. An important motivation of the work
complete formal verification. The presented techniques have been summarized here and presented in [20] is therefore, that the
implemented in the tool FISACO, which is demonstrated on an
industrial processor. automation of the formal verification of some well defined
I. INTRODUCTION class of circuits eases the migration from simulation to formal
verification and hence helps to introduce this technology. We
The complexity of digital hardware systems has shown an chose smaller single pipeline processors as this class.
exponential growth over the last decades and it is growing For processors, a structured manual verification flow is
still. To keep track of large systems during the design process, available today [2]. But, automation of the verification is
high level models are used increasingly. Especially for the quite low, the more comprehensive the verification is. On the
design of processors, architecture or transaction level models other hand, existing approaches for the automatic verification
form the core of an elaborate tool chain that enables the of processors (see related work in Sect. II) often require a
automatic generation of simulators, assemblers or compilers, background of deep and general insight into verification goals
like Facile [27] or LISA [9]. However, formal verification of and correctness criteria.
the functionality of the design is still not part of this tool chain. In this paper we present a technique for the automatic
There exist several techniques for the verification of hard- generation of a complete property suite for processors. The
ware designs. In simulation based verification, the outputs of starting point of the approach is an architecture description
the implementation are compared to a golden reference model, of the processor. By defining a number of mapping functions
that is usually based on a transaction level description. But, the user captures how the abstract concepts are mapped to the
simulation is not well suited to cover the whole functionality register transfer level (RTL) implementation. These mapping
of a pipelined processor because achieving a sufficient design functions refer to pipeline stages, stall and cancel signals,
quality for such a processor requires a huge simulation-based and similar objects that design and verification engineers are
verification effort and there is no guarantee that all possible familiar with. Following this approach, the specification is
bugs have been considered. In contrast, formal techniques offer captured in a concise and readable form, while the underlying
the highest quality of verification [15]. general processor model enables the verification of several
One successful technique is Interval Property Checking advanced processor features like multicycle instructions, out- (IPC) [23], a technique similar to Bounded Model Checking of-order termination as well as exceptions. The generated [3]. IPC is used to check if a system satisfies a set of properties property suite is complete by construction in the sense of [5]. about the operations of a design like the processing of a As a driving verification engine, the OneSpin 360 MV tool request in a bus bridge, the execution of an instruction in a [24] is used, offering the performance and capacity to formally verify whole processor designs. This work was carried out jointly by the group for computer architecture The main contribution of this work is a well structured yet at University of Bremen∗, Germany and OneSpin Solutions GmbH†. It was supported in part by the German Federal Ministry of Education and Research pragmatic approach to tackle the formal verification of pro- (BMBF) within the project VerisoftXT under contract no. 01 IS 07008 C. cessors. It offers an exhaustive verification for a certain class
of designs, while the automatic generation of the properties or properties from simulation traces. However, they are only
increases the verification productivity significantly compared suited for an initial design understanding rather than for a veri-
to manual coding of properties. As the input for the approach fication against a specification. In contrast, our approach starts
is an abstract architecture description, the method can easily be with a specification that is then related to the implementation
integrated with existing tool chains for processor design. The in a well structured way.
automatic generation of the properties is implemented in the
tool FISACO. The approach is demonstrated with an industrial III. FORMAL VERIFICATION SETTING
control processor used in embedded automotive systems, a Within the last two decades, there has been a lot of research
domain with particularly high quality requirements. in formal verification techniques. Methods based on Boolean
The paper is structured as follows. Related work is discussed satisfiability (SAT) have proven to be a robust solution. One
in Sect. II. In Sect. III, our formal verification techniques are prominent technique is SAT based Bounded Model Checking
reviewed. The automatic property generation is described in (BMC), that has first been described in [3]. Successive im-
Sect. IV. Sect. V shows the application of the approach to an provements in performance have made BMC a suitable method
industrial processor design. Sect. VI concludes the work. for the formal verification of larger scale designs. For the work
II. RELATED WORK at hand, we use the techniques described in [23], referred
An approach for the automatic equivalence verification of to as interval property checking (IPC). In the following, this
general transaction level models (TLM) with timed imple- verification methodology will be briefly outlined.
mentations is presented in [4], where the different levels of In contrast to BMC, only safety properties are verified using
abstraction are related by events. However, the lowest level of IPC. As digital circuits always have a finite response time,
abstraction in [4] is behavioral RTL and it is not clear how this is not a serious restriction in practice. It is rather natural
the concept of events relates to optimized pipeline designs. to capture the specification of a design in terms of safety
In other words, an automated equivalence check between a properties. Furthermore, using IPC, these properties can be
sequential processor architecture and a pipelined RTL imple- verified with bounded proofs, which can be checked efficiently
mentation is not feasible for optimized industrial designs. using a SAT solver.
There has been work on the formalization of pipelined The main idea of IPC is to use an arbitrary starting state
designs. Part of the approaches in the literature use formal instead of the initial state used in BMC. Any property that
models for the automatic design of correct pipelines [19], or holds starting from an arbitrary state then also holds from any
to accompany the design process [10], [16]. In [10], starting reachable state and thus, it is exhaustively verified. Conversely,
from a simple model, the design is incrementally refined until false negatives can occur in IPC, i.e. counterexamples for
a pipelined implementation is obtained. A CTL specification is properties starting in unreachable states may be produced.
transformed along with the design to prove the correctness of These false negatives need to be removed by adding invariants
the refinement steps. A similar approach is presented in [22]. in order to restrict the starting state. For more details on the
It decomposes the correctness proof for a complex pipelined idea of IPC and the following formalization, refer to [23].
machine with branch prediction into several steps, the first A synchronous circuit is modeled as a finite state machine
of which proves the compliance of a simple version of the (FSM) 푀 = (퐼, 푆, 푆0, Δ, Λ, 푂) with input alphabet 퐼 ⊆ 픹푛, processor with its ISA. The drawback of these approaches is output alphabet 푂 ⊆ 픹푤, a finite set of states 푆 ⊆ 픹푚, output that they cannot handle industrial designs containing legacy function Λ and next state function Δ. The set 푆0 ⊆ 푆 denotes code and manual optimizations that are needed to match hard the initial states. With next state function Δ : 픹푛 ×픹푚 → 픹푚, power and timing constraints. the transition relation of the circuit is given by There are various techniques for the verification of existing 푇 (푠, 푠′) = ∃푥 ∈ 픹푛 : 푠′ ≡ Δ(푥, 푠). (1) processors [1], [13], [14], [30]. In [1], a formal pipeline model A safety property 푓 = AG(휑) is translated to a Boolean is introduced that is based on parcels (instructions) process- function [[푓 ]] ing through the pipeline. By instantiating several predicates point 푡, checking the validity of formula 휑 at time- describing the pipeline, the correctness of the design can 푡. Here, the translation is done such that a satisfying be proved formally. However, the model is rather abstract assignment of [[푓 ]]푡 corresponds to a counterexample of 휑. and the predicates seem difficult to derive. In contrast, we The resulting function depends on the inputs, outputs and provide a clear distinction between the architecture layer and states within a bounded time interval [0, 푐]. IPC searches for the mapping to the implementation. Furthermore, our mapping counterexamples by solving the SAT instance functions have a more intuitive counterpart in the designer’s ^푐 intent of implementing a pipeline. 푇 (푠푡+푖, 푠푡+푖+1) ∧ [[푓 ]]푡. (2) Further approaches for processor verification rely on inter- 푖=0 active theorem proving [18], [26], [29]. This generally offers The transition relation is unrolled within the time interval a high level view on the design. Theorem proving however [0, 푐] and it is connected to the single instantiation of [[푓 ]]푡. requires a significant level of expertise that is usually not In order to avoid unreachable counterexamples, invariants are available to designers or verification engineers in practice. added. In many cases, such invariants can even be generated Approaches for the automatic generation of properties are automatically [31]. In the context of the described method- given in [17], [25]; they are based on learning dependencies ology, the needed invariants are usually less complex than
Architecture
Description Property Property Formal A C time
Generation Suite Equivalence RTL
Mapping Proof Assertions
Information
reset
Fig. 1. Verification Flow with Property Generation A fetch decode execute ... C
write−
the main properties; they can thus be verified using inductive back
proof techniques like 푘-induction [28]. For more details on the A C
method, refer to [23]. ... write−
IPC is a powerful verification technique, enabling the for- fetch decode execute back
malization of a specification in terms of safety properties ...
and its verification against the implementation. However, to
be sure that no bugs have been missed, the verification Fig. 2. Interaction of generated properties
engineer needs to reason about the completeness of the written
property suite. A technique to formally check whether a set After identifying the visible registers and interfaces, the
of properties forms a complete specification is described in instruction set and the exception behavior of the processor are
[5], [8]. These techniques have been successfully applied to described on the architecture level. The generated property
industrial processor designs [6]. suite consists of instruction properties and a set of consis-
Completeness analysis determines whether every possible tency assertions. While each instruction property describes the
input scenario—corresponding to a transaction sequence of processing of a single instruction until it leaves the pipeline,
the design—can be covered by a chain of properties that the consistency assertions ensure the correct interaction of
predicts the value of states and outputs at every point in time. multiple instructions and the consistent pipeline behavior, if no
In other words, any two designs fulfilling all the properties instruction is present in a dedicated stage. The latter includes
of a complete property suite are formally equivalent. The e.g. checking that empty stages will not update any state
completeness analysis basically boils down to check in the end elements. These assertions also help the user in finding an
state of each property whether (1) there is always a successor appropriate mapping by giving him a feedback for debugging.
property with matching assumptions, (2) the successor prop- Basically, the equivalence of the property suite and the DUV
erty is uniquely determined and (3) each property describes is established by chaining the generated properties, as shown
the outputs and states of the design uniquely. For more details in Figure 2. Each property is depicted as a rectangular box,
on the methodology please refer to [5], [8]. consisting of an assume part (assumption A) and prove part
For the formal verification of the generated property suite (consequent C, shaded gray in the figure). The properties are against the RTL, we use OneSpin 360MV [24]. This com- hooked up at the time point when the processor is ready to mercial solution covers the required spectrum of formal execute the next instruction, represented by the big black dots verification—from the verification of SystemVerilog assertions in the picture. Thus, starting from reset, the first property all the way to the automatic completeness analysis described proves that the new instruction state (NIS) will be reached. above. Among various other proof engines, 360MV also offers Then, the following properties assume the NIS and prove that IPC and 푘-induction with sufficient capacity and performance after fetching the dedicated instruction, NIS will be reached to handle the complete verification of processors [6]. again, enabling the connection to the next instruction property. IV. VERIFICATION USING GENERATED PROPERTIES The basic approach has been described in detail in [20]; it is based on a patent application [7]. Technically, the basis of the approach presented here is to provide a general formal processor model that can be A. General Processor Model customized by the user to match his specific implementation. The approach presented here is limited to a class of pro- The general processor model can be thought of as a tool box cessors that is common in industrial designs. The class is with several design features to be picked out. The customiza- characterized by the following features: tion is done by setting the architecture design parameters, ∙ Single pipeline like the number of pipeline stages and the possible interface ∙ In-order-execution, out-of-order termination transactions. Furthermore the mapping from the architecture ∙ Register files with multiple prioritized write channels description to the RTL has to be established by defining a ∙ Exceptions and interrupts number of mapping functions. The basic flow is shown in ∙ Delayed branch instructions Fig. 1. The general processor model consists of three parts: ∙ Branch prediction 1) The pipeline model describes the movement of the ∙ Multicycle instructions instructions through the stages ∙ Multiple interfaces, including pipelined protocols 2) The datapath model describes register access and data Note that a typical CPU also contains complex data memory forwarding and prefetch logic. With our approach, the core of such a 3) The interface model describes memory and bus ac- CPU can be verified with generated properties, providing cesses exact interface descriptions to the data memory and prefetch.
INSTR 2 INSTR 1 RESET
These modules can in turn be verified manually, thus ensuring Arithmetic Instruction ADD opcode = 11000
correctness of the overall CPU. In such manual extensions, the No memory access.
already established mappings and models can easily be reused. 15 opcode11 10ra 8 7 rb 5 4 rt 2 R[rt] := R[ra] + R[rb];
The components that are included in the architecture view (a) Specification
are described in the following. A processor receives its instruc- tions via an instruction memory interface, that is addressed by 1 registers := R; a program counter 푃 퐶. The currently processed instruction 2 interfaces := DMEM; word is denoted 퐼푊 . There can be an arbitrary number of 3 transactions_DMEM := IDLE, READ, WRITE; 4 architecture registers and flags. There can be interfaces to data 5 simple_instruction ADD { memories or buses, each associated with a set of transactions, 6 TRIGGER_STATE := true; at least containing the idle transaction. 7 TRIGGER_IW := IW[15:11] == ADD_op; 8 UPDATE_PC := (PC + 2)[7:0]; The instructions are described on the architecture level. 9 VREGISTER_1 := R(IW[10:8]); In order to verify them against the RTL implementation, a 10 VREGISTER_2 := R(IW[7:5]); mapping has to be established by the user. For the components 11 UREGISTER_1 := R(IW[4:2]); 12 UPDATE_1 := (VREGISTER_1 + VREGISTER_2)[15:0]; 푃 퐶 and 퐼푊 , the corresponding mapping function is usually 13 DMEM_IDLE; } pointing to a dedicated signal in the RTL showing the value of (b) Architecture description the program counter and an instruction register, respectively. Fig. 3. Informal specification and architecture description example However, the mapping of an architecture register file requires a model of the pipelining due to forwarding mechanisms that C. Pipeline Model are part of every pipelined processor design. We first introduce In a pipeline, the processing of instructions is overlapped the architecture description, followed by a discussion of the in order to speed up computations. Thus, a new instruction models for the pipeline, the data path, and the interfaces. starts before the preceding one has terminated. For example, Finally, the generation of the property suite is described, and a typical simple pipeline would partition an instruction into the completeness of the model is discussed. fetching the instruction word from the memory, decoding it, B. Architecture Description executing logical and arithmetic operations and writing the In our approach, there is a clear distinction between the result back into the register file. Note that this section only architecture description and the mapping functions. In this introduces basic pipeline modeling for the control path in order way, a readable and proven correct description of the ISA is to keep track of the different instructions in the pipeline. The obtained. The mapping functions relate the ISA to the RTL. handling of forwarding is part of the data path of a pipeline In the first section of the architecture description, the com- and discussed in the following Sect. IV-D. ponents of the processor are listed, comprising all architecture The major challenge in designing a correct pipeline are registers and flags. Furthermore the interfaces to memories and hazards, i.e. conflicts between instructions that are processed buses are given, as well as the respective transaction types on at the same time in different stages. If an instruction needs data these interfaces. The main section of the architecture descrip- that is currently being computed by a preceding instruction, a tion consists of the ISA description, where all instructions of read-after-write conflict occurs and the succeeding instruction the processor are defined. In the ISA description, the registers needs to wait for the data. Thus, a mechanism to stall a stage are referred to by their specification name. is needed. Another hazard is related to branching instructions. For each instruction, first the execution condition is given When a jump is taken, this is typically detected at a time (TRIGGER). Then, the updates of the program counter and the when subsequent instructions from the sequential program architecture registers and flags need to be defined, followed flow already have been fetched. Therefore, the pipeline must by the definition of one transaction per interface. The updates possibly be cleaned from wrongly fetched instructions, requir- are defined by the read registers (VREGISTER), the target ing a cancel mechanism. As this may lead to stages that are register (UREGISTER) and the value that will be written by not processing any instructions, it is desirable to distinguish the instruction (UPDATE). between empty and full stages to prevent spurious register updates or similar faults. Based on these requirements, we As an example, consider Fig. 3(b) with a simple processor now define our pipeline model. description including an ADD instruction. The triggers for the Given the number of pipeline stages 푛, we define the set instruction are divided into two statements, one of which 풮 = {1, 2, . . . 푛} of pipeline stages. The pipeline architecture only depends on the architecture state (TRIGGER_STATE), is further classified by defining some constants that refer to while the second one depends on the instruction word certain stages like the decode stage 푑푒푐 ∈ 풮 and the stages (TRIGGER_IW). Besides the update of the program counter 푖푎, 푖푣 ∈ 풮 that denote the stage when the instruction memory in line 8, there is one update of the register R, where two is accessed and when the instruction word is valid, respec- registers are read addressed by parts of the instruction word tively. The processing of instructions by the pipeline is defined (lines 9 and 10). The target register is given in line 11 and the by the mapping functions1 푓 푢푙푙, 푠푡푎푙푙, 푐푎푛푐푒푙 : 풮 → 픹. sum of the two source registers is defined in line 12. Finally, The value of 푓 푢푙푙(푠) reflects if the pipeline stage 푠 currently there is no transaction on the data memory interface, indicated by the statement DMEM_IDLE in line 13. 1 The state of the design is an implicit parameter of all mapping functions.
t1 t2 t3 t1,1 t2,1 t3,1 t2,2 t3,2 t2,3 t3,3
stall(1) stall(1)
stage 1 stage 1
stall(2) stall(2)
stage 2 dispatch(2)
stall(3) stage 2
stage 3 stall(3)
Fig. 4. Normal processing of instructions by the pipeline stage3
Fig. 6. Pipeline for a multicycle instruction
t1 t2 tjmp
stall(1) ping functions for of out-of-order termination, exceptions, and
stage 1 multicycle instructions need to be defined.
stall(2) 푙푎푠푡 푠푡푎푔푒, 푖푛푗푒푐푡, 푑푖푠푝푎푡푐ℎ : 풮 → 픹, (3)
stage 2 Here, 푙푎푠푡 푠푡푎푔푒(푠) indicates that the instruction in stage 푠
stall(3) will leave the pipeline, 푖푛푗푒푐푡(푠) states that an instruction will
stage 3 be injected into stage 푠 in the next cycle due to an exception,
and 푑푖푠푝푎푡푐ℎ(푠) describes that a multicycle instruction is
Fig. 5. Pipeline for a taken jump instruction started in stage 푠. The pipeline of a multicycle instruction
according to our model is shown in Fig. 6. There, the partial
holds an instruction. If 푠푡푎푙푙(푠) is true, the instruction in stage instructions are dispatched in stage 2. 푠 will not proceed to the succeeding stage 푠 + 1. 푐푎푛푐푒푙(푠) indicates that the instruction currently in stage 푠 will be D. Data Path Model removed from the pipeline and will have no more effects Based on the above control path model of the pipeline, we in later stages. The normal processing of two consecutive can now define the data path model, describing the way how instructions is shown in Fig. 4, where time progresses from left data is read, forwarded and stored in the registers. to right. The time-points when the first instruction is allowed For a register file 푅, a mapping function 푐푢푟푟푒푛푡푅 : ℐ푅 → to proceed to the next stage are denoted by 푡1 to 푡3, i.e., 풟푅 is defined that returns the current implementation state of 푠푡푎푙푙(푠) evaluates to false at 푡푠. The boxes indicate the time- the register, where ℐ푅 is the index set and 풟푅 is the data points when the respective stage is processing an instruction, domain of register 푅. For the data path of the register the i.e. 푓 푢푙푙(푠) = 1. The pipeline for a taken jump or mispredicted following mapping functions have to be defined: branch instruction is shown in Fig. 5. At timepoint 푡푗푚푝, the 푤푟푖푡푒 canceling of two succeeding instructions is indicated by the 푅, 푣푎푙푖푑푅 : 풮 → 픹 (4a) dark boxes. After the taken jump, the target instruction is 푑푒푠푡푅 : 풮 → ℐ푅 (4b) fetched from the instruction memory. 푑푎푡푎푅 : 풮 → 풟푅, (4c) The mapping functions have to be defined by the user. This where 푤푟푖푡푒 means for example, that the user needs to identify how the 푅(푠) indicates if the instruction in stage 푠 is going implementation encodes the fact that a stage is full. Since the to update register 푅, while 푑푒푠푡푅(푠) and 푑푎푡푎푅(푠) specify functions are used in the properties, the verification fails as the target register and the data to be written, respectively. By long as the model is not completed properly. 푣푎푙푖푑푅(푠), it is stated if stage 푠 already produces a valid result. In addition to the basic model, further pipeline operations With these building blocks, the forwarding in the pipeline to can be supported. It is common for instructions to leave the some forwarding target stage 푠 ∈ 풮 can easily be captured: the pipeline before the last stage, if no more actions will be value of a register 푅 with index 푖 ∈ ℐ푅 in the forwarding target taken in later stages, in order to prevent conflicts. Thus, a stage 푠 is recursively given by checking whether succeeding last stage can be defined for each instruction. Exceptions are stages write to register 푖; this corresponds to the forwarding a crucial feature for practical applications. By nature, they logic in the pipeline. interrupt the normal instruction processing. The most general ⎧ 푐푢푟푟푒푛푡 (푖), if 푠 ≥ 푤푟푖푡푒푏푎푐푘 ⎨ 푅 푅; exception model, that is still suited to conform with our approach, is an injection of a new instruction into the pipeline 퐷푎푡푎푅(푠, 푖) = 푑푎푡푎 푠 + 1), if 푤푟푖푡푒 (푠 + 1)∧ 푅 after an exception has been acknowledged. Finally, for more 푅( 푑푒푠푡 (푠푅+ 1) = 푖; ⎩ complex arithmetic operations or interactions with protocol 퐷푎푡푎푅(푠 + 1, 푖), otherwise. driven interfaces, multicycle instructions are frequently used Note that this automatically generated function 퐷푎푡푎푅 in processor designs. Typically, an FSM in an early stage is actually captures the complex mapping of the visible register responsible for dispatching partial instructions in the pipeline. 푅 to the implementation, i.e., the architecture value of 푅 for For these refinements of the simple model, additional map- an instruction in the pipeline is the value of 퐷푎푡푎푅 in the
forwarding target stage of that instruction. Since the value of TABLE I
퐷푎푡푎푅 may be invalid because the result of some instruction is USER INPUT FOR PROPERTY GENERATION
not available yet, we introduce an additional mapping function (a) Constants
capturing whether the forwarding data is indeed valid: Name Domain Description
⎧ 푛 ℕ number of stages
false , if 푠 < dec;
푑푒푐 풮 = {1, . . . , 푛} decode stage
⎨true , if 푠 ≥ 푤푟푖푡푒푏푎푐푘푅; 푖푎 풮 instruction memory access stage
푉 푎푙푖푑푅(푠, 푖) = if 푤푟푖푡푒 (푠 + 1)∧ 푖푣 풮 stage in which instr. word is valid
푣푎푙푖푑 (푠 + 1), 푅 푖푛푡 풮 highest stage for interrupt injection
푅 푑푒푠푡 푠
⎩푉 푎푙푖푑푅(푠 + 1, 푖), 푅( + 1) = 푖; 푑푎IF 풮 access stage for interface IF
otherwise. 푑푣IF 풮 data valid stage for interface IF
푤푟푖푡푒푏푎푐푘푅 풮 writeback stage for register 푅
E. Interface Transactions (b) Mapping functions
In order to verify the interfaces of the processor, the Arch. Function Signature Description
following model is used. For each interface 퐼퐹 the constants Basic components
푑푎퐼퐹 , 푑푣퐼퐹 ∈ 풮 denote the stage when a data access is issued PC 푝푐 ℕ program counter
and when valid data is returned, respectively. For each inter- IW 푖푤 ℕ instruction word
face, a set of transactions 푇 퐴퐼퐹 is defined by the user with at Pipeline Model
least 퐼퐷퐿퐸 ∈ 푇 퐴퐼퐹 . For each transaction 푡푎 ∈ 푇 퐴퐼퐹 a func- 푓 푢푙푙 풮 → 픹 stage active
tion 푡푎퐼퐹 : ℕ × ℕ → 픹 is defined, where 푡푎퐼퐹 (푎푑푑푟, 푤푑푎푡푎) 푠푡푎푙푙 풮 → 픹 stage stalled
captures that the specified transaction takes place in the 푐푎푛푐푒푙 풮 → 픹 stage is canceled
푖푛푗푒푐푡 풮 → 픹 inject launch instr.
design, optionally involving the address 푎푑푑푟 and (for writing 푑푖푠푝푎푡푐ℎ 풮 → 픹 dispatch micro instr.
transactions) the write data 푤푑푎푡푎. As for the example in 푙푎푠푡 푠푡푎푔푒 풮 → 픹 instr. leaves pipeline
Fig. 3(b), the three functions 퐼퐷퐿퐸퐷푀 퐸푀 , 푅퐸퐴퐷푀 퐸푀 8 푐푢푟푟푒푛푡 Datapath Model
푅 푅
and 푊 푅퐼푇 퐸퐷푀 퐸푀 need to be defined, capturing for given > 풟 implementation register
>
address and data, if the respective transaction is issued on the R < 푤푟푖푡푒푅 풮 → 픹 stage will write
푅 푅 write destination
> 푑푒푠푡 풮 → ℐ
>
data memory interface. : 푑푎푡푎푅 풮 → 풟푅 write data
Besides the transactions, for each interface a mapping 푣푎푙푖푑푅 풮 → 픹 data is valid
function 푟푑푎푡푎퐼퐹 points to the implementation port, where 푖푏푢푠 푟푒푎푑 Interfaces
data is read in to the processor. Finally, there is a static IF TA 푡푎IF ℕ → 픹 instruction fetch
interface to the instruction memory, given by the predicate IF RDATA 푟푑푎푡푎IF ℕ × ℕ → 픹 transaction
ℕ read data
푖푏푢푠 푟푒푎푑 : ℕ → 픹, which checks if the instruction memory
is currently being accessed for a given value of the program
counter. or stalled, and 푠 is empty or it will proceed to the next stage,
then 푠 must be empty in the next cycle. Here, 푓 푡 denotes the
F. Consistency Assertions value of 푓 at timepoint 푡. Other assertions ensure, for example,
While the above models describe the processing of instruc- that instructions do not overwrite each other and that empty
tions by the successive pipeline stages, additional assertions pipeline stages do not have an effect on the visible registers
are needed for the overall correctness of the processor. This or issue interface transactions.
includes the behavior of empty pipeline stages as well as the
interaction of succeeding instructions. For this purpose, a set G. Generating The Property Suite
of consistency assertions are automatically generated. In order to adapt the general processor model to the actual
Note that the overall verification is fail safe, i.e. it cannot DUV, the user needs to specify the mapping functions de-
succeed if the design is not correct. But, even for a correct scribed in Sections IV-C to IV-E. The user input is summarized
design, finding the appropriate mapping functions can be in Table I. Besides the basic data on the pipeline, given by
difficult. The consistency assertions provide useful information a set of constants, the table shows the mapping functions
on the status of the modeling. Failing assertions can point corresponding to the architectural components of the general
the user to certain mapping functions that need to be revised processor model.
to complete the verification, thereby guiding the debugging During the generation of the property suite, an architecture
process. register 푅(푖) is replaced by an instantiation of the function
We show the following assertion as an example. For a more 퐷푎푡푎푅(푠푓 푤푑, 푖), where 푠푓 푤푑 is the forwarding target stage, detailed description of the consistency assertions, see [20]. which is usually the decode stage. ∀푠, 2 ≤ 푠 ≤ 푛 : The generated properties prove the correctness of the in- ( (¬푓 푢푙푙푡(푠 − 1) ∨ 푠푡푎푙푙푡(푠 − 1))∧ (5) structions on the implementation level. For this, we define 0 to be the timepoint when the respective instruction enters (¬푓 푢푙푙푡(푠) ∨ ¬푠푡푎푙푙푡(푠)) ) ⇒ ¬푓 푢푙푙푡+1(푠) 푡 the pipeline and 푡푖 > 푡푖−1, 1 ≤ 푖 ≤ 푛 to be the timepoints This assertion states that it is illegal to create full stages in when the instruction is allowed to proceed from stage 푖 (see the middle of the pipeline: when the stage before 푠 is empty also Fig. 4–6). The properties have an implication structure
퐴 ⇒ 퐶. Whenever the assumptions 퐴 evaluate to true, the given, followed by a presentation of the verification results.
prove part 퐶 must hold as well. In the following, we give an Besides, during its development, the method has been applied
overview of the templates for instruction execution without for the complete verification of smaller processor designs from
exceptions. There are similar templates for exceptions; for the opencores site (www.opencores.org). Details cannot
more details, see [7]. Note that there are two templates for be given here due to page limitation.
conditional branches depending on whether the branch is taken
or not. In this way, branch prediction can easily be modeled. A. PCP Processor
The assume part for an instruction 푚 basically consists of the The PCP processor is a control processor that is part
following assumptions: of automotive systems. Its main purpose is the monitoring
∙ The instruction enters the pipeline at 푡0. of peripheral components in order to release the central
∙ In decode, instruction 푚 is triggered. CPU [11]. Therefore, a great share of the instruction set
∙ The instruction proceeds from stage 푖 at 푡푖, 푖 ≤ 1 ≤ 푛. is dedicated to data transfer and bit operations, which are
∙ The instruction is not canceled by preceding instructions frequently used in typical control applications. The PCP is and not replaced by an exception call. connected to a data memory and a pipelined FPI bus (Flexible For each instruction, mainly the following will be proved: Peripheral Interface). As the bus operations require complex ∙ The instruction is fetched from the instruction memory protocol transactions, 35% of the instruction set are multicycle ∙ The program counter is updated correctly instructions. In total, the PCP has 66 instructions, divided ∙ The full stages are correctly tagged by the 푓 푢푙푙 function into arithmetic/logic instructions, jump and control, memory ∙ No 푐푎푛푐푒푙 is generated (except for jump instructions) instructions, bus instructions and complex math instructions. ∙ All read registers are valid The processor is implemented as a four stage pipeline. The ∙ The registers will be updated (or remain stable) corre- register file contains 8 registers of 32 bit, where one register is sponding to the ISA; this includes the verification of a special purpose register containing various status flags and correct forwarding the program counter. The whole RTL implementation adds up ∙ The correct transactions will take place on the interfaces to about 17.000 lines of VHDL code, accompanied by a de- tailed informal specification. Regarding the complexity of the H. Completeness design and the quality of the source code and documentation, The pipeline model is built such that the final property suite the time for the formal verification was estimated with 8 to 10 in combination with the consistency assertions is complete person months, needed to manually write a complete property by construction, if some rules are respected for the definition suite using OneSpin 360 MV. of the mapping functions. For a proof for the basic pipeline model, see [7]. B. Results However, the completeness of a concrete generated property The PCP has been verified using the presented approach. suite additionally depends on the proper definition of some of The informal specification was ported to an architecture de- the functions. If, for example, the user defined the function scription. Most of the manual effort was spent for the definition for a read transaction by simply returning true, it is obvious and refinement of the pipeline and datapath model, given by that the interface signals are not checked at all for read the mapping functions explained in Sect. IV-C to IV-E. Using transactions and there is a gap in the verification. In summary, our approach, the instruction set of the PCP could be success- the generation ensures that all possible scenarios are covered fully verified except for two highly complex bus instructions with properties, but not that all transactions verify all outputs. involving nested loops and excluding three complex math However, the automatic gap detection of OneSpin 360 can be instructions. For these instructions, the control mechanisms used to close these gaps as well. of the PCP did not match our general pipeline model. It does V. APPLICATION not seem useful to extend the model for these cases, as they are very specific to the PCP implementation. Instead, having The above method has been implemented as a front-end found a good representation of most of the functionality based for OneSpin 360 MV; we call it FISACO (Formal Instruction on our processor model, the defined functions can be reused Set Architecture Compiler). It takes an architecture description for further manual verification. This has also been done for and automatically generates the instruction properties and the some additional functionality beyond the ISA, like loading consistency assertions in a form readable for 360MV. The and storing full register contexts. The overall verification of the mapping information needs to be supplied in a temporal logic PCP with our methodology took about 5 person months. Thus, format. The processor model formed by both the architecture we could achieve an estimated productivity gain of 100%. description and the mapping information can then be verified The verification has been carried out on 2.2 GHz work- and debugged using 360MV. station with 16 GB memory. Details on the proof times and In the following we will describe the application of the the used memory can be found in Table II. As can be seen, proposed method on an industrial processor design. We suc- the generated consistency assertions could be proved quickly. cessfully verified a control processor that is used in automotive Most of the time was spent for the verification of arithmetic applications, the Peripheral Control Processor (PCP) by In- and bus instructions. The latter ones are mostly multicycle fineon Technologies. First, the basic data of the PCP will be instructions and thus the design needs to be unrolled for up to
TABLE II
VERIFICATION RESULTS [7] J. Bormann, S. Beyer, and S. Skalberg. Equivalence verification between
transaction level models and RTL at the example of processors, 2008.
Category Properties Total time Avg. time Memory [8] European Patent Application, publication number EP1933245.
assertions 34 600 s 17.6 s 937 MB J. Bormann and H. Busch. Method for determining the quality of a set
arithmetic 15 18788 s 1252.5 s 2500 MB of properties, applicable for the verification and specification of circuits,
2007. European Patent Application, publication number EP1764715.
logic/bit 18 3368 s 187.1 s 2500 MB [9] G. Braun, A. Nohl, A. Hoffmann, O. Schliebusch, R. Leupers, and
jump 7 220 s 31.4 s 2500 MB H. Meyr. A universal technique for fast and flexible instruction-set
memory 16 2135 s 133.4 s 2500 MB architecture simulation. IEEE Transactions on CAD, 23(12):1625–1639,
bus 10 3214 s 321.4 s 2500 MB 2004.
other 3 147 s 49.1 s 1763 MB [10] C. Braunstein and E. Encrenaz. Formalizing the incremental design and
total 103 7:54 h verification process of a pipelined protocol converter. In IEEE Int’l
Workshop on Rapid System Prototyping (RSP), pages 103–109, 2006.
[11] S. Brewerton. Dual core processor solutions for IEC61508 SIL3 vehicle
26 cycles. Note that the two most difficult instructions make [12] safety systems. In Embedded World Conference, 2007. R. Brinkmann, P. Johansson, and K. Winkelmann. Advanced Formal up 3 hours and 48 minutes or 48% of the total runtime. Verification, chapter Application of Property Checking and Underlying Techniques, pages 125–166. Kluwer Academic Publishers, 2004. VI. CONCLUSIONS [13] R. Bryant, S. German, and M. Velev. Processor verification using effi- We have presented an approach for the automatic generation cient reductions of the logic of uninterpreted functions to propositional logic. ACM Transactions on Computational Logic, 2(1):93–134, 2001. of a complete property suite from an architecture description [14] J. Burch and D. Dill. Automatic verification of pipelined microprocessor of a processor. There is a clear distinction between the control. In Proc. of the Int’l Conf. on Computer Aided Verification, pages architecture model and the mapping information connecting [15] 68–80, 1994. W. B¨uttner. Complex hardware modules can now be made free of architecture to RTL implementation. The architecture model functional errors without sacrificing productivity. In Proc. of Int’l Conf. can be easily derived from an informal specification. [16] on Abstract State Machines, B and Z, LNCS, pages 1–3. Springer, 2008. The mapping from the specification to the implementation is A. Chattopadhyay, A. Sinha, D. Zhang, R. Leupers, G. Ascheid, and H. Meyr. Integrated verification approach during ADL-driven processor based on a general pipeline model that reflects the designer’s design. In IEEE Int’l Workshop on Rapid System Prototyping (RSP), intent in implementing a correct pipelined processor. A set [17] pages 110–118, 2006. of consistency assertions is automatically generated to check G. Fey and R. Drechsler. Design understanding by automatic property generation. In Workshop on Synthesis And System Integration of Mixed the correctness of the model and helps the user in finding Information technologies (SASIMI), pages 274–281, 2004. a suitable mapping. When the mapping and the architecture [18] R. Hosabettu, G. Gopalakrishnan, and M. Srivas. Verifying advanced description are finished, the generated property suite forms a microarchitectures that support speculation and exceptions. In Proc. of the Int’l Conf. on Computer Aided Verification, pages 521–537, 2000. model of the design, i.e. the verification is exhaustive. [19] D. Kroening and W. J. Paul. Automated pipeline design. In Proc. of the The practicability of the approach has been demonstrated [20] 38th Conference on Design Automation, pages 810–815. ACM, 2001. on an industrial processor design, a control processor from U. K¨uhne. Advanced automation in formal verification of processors. PhD thesis, University of Bremen, 2009. the automotive domain. With the presented methodology, the [21] U. K¨uhne, S. Beyer, and C. Pichler. Generating an efficient instruction estimated verification productivity could be doubled. set simulator from a complete property suite. In Proc. of the IEEE Int’l In the future, we want to integrate this approach with our au- [22] Symposium on Rapid System Prototyping, 2009. P. Manolios and S. Srinivasan. A complete compositional reasoning tomatic generation of efficient instructions set simulators (ISS) framework for the efficient verification of pipelined machines. In Proc. [21]. This allows to generate both a complete property suite [23] of the Int’l Conf. on Computer Aided Design, pages 863–870, 2005. and an efficient ISS from a common architecture description, M. Nguyen, M. Thalmaier, M. Wedler, J. Bormann, D. Stoffel, and W. Kunz. Unbounded protocol compliance verification using inter- ensuring that the generated ISS complies to the verified RTL val property checking with invariants. IEEE Transactions on CAD, code. A complementary extension would be the use of existing [24] 27(11):2068–2082, Nov. 2008. ADL like LISA, facilitating the integration of formal methods OneSpin Solutions GmbH, Munich, Germany. OneSpin Verification Solutions. http://www.onespin-solutions.com, 2009. into the tool chain for processor design. [25] F. Rogin, T. Klotz, G. Fey, R. Drechsler, and S. R¨ulke. Automatic generation of complex properties for hardware designs. In Design, REFERENCES [26] Automation and Test in Europe, 2008. J. Sawada and W. Hunt. Processor verification with precise exceptions [1] M. Aagaard. A hazards-based correctness statement for pipelined and speculative execution. In Proc. of the Int’l Conf. on Computer Aided circuits. In CHARME, pages 66–80, 2003. Verification, pages 135–146, 1998. [2] S. Beyer, C. Jacobi, D. Kroening, D. Leinenbach, and W. Paul. Putting [27] E. Schnarr, M. Hill, and J. Larus. Facile: a language and compiler it all together—formal verification of the VAMP. STTT Journal, 8(4– for high-performance processor simulators. In Proc. of the Conf. on 5):411–430, Aug. 2006. Programming language design and implementation, pages 321–331, [3] A. Biere, A. Cimatti, E. Clarke, and Y. Zhu. Symbolic model checking 2001. without BDDs. In Tools and Algorithms for the Construction and [28] M. Sheeran, S. Singh, and G. St˚almarck. Checking safety properties Analysis of Systems, volume 1579 of LNCS, pages 193–207. Springer using induction and a SAT-solver. In FMCAD, pages 127–144. Springer, Verlag, 1999. 2000. [4] N. Bombieri, F. Fummi, G. Pravadelli, and J. Marques-Silva. Towards [29] J. Skakkebæk, R. Jones, and D. Dill. Formal verification of out-of-order equivalence checking between TLM and RTL models. In Int’l Conf. on execution using incremental flushing. In Proc. of the Int’l Conf. on Formal Methods and Models for Codesign, pages 113–122, 2007. Computer Aided Verification, pages 98–109, 1998. [5] J. Bormann. Vollstaendige funktionale Verifikation. PhD thesis, Univer- [30] M. Velev. Formal verification of pipelined microprocessors with delayed sity of Kaiserslautern, 2009. branches. In Int’l Symp. on Quality Electronic Design, page 4pp., 2006. [6] J. Bormann, S. Beyer, A. Maggiore, M. Siegel, S. Skalberg, T. Black- [31] M. Wedler, D. Stoffel, and W. Kunz. Exploiting state encoding for more, and F. Bruno. Complete formal verification of TriCore2 and other invariant generation in induction-based property checking. In Asia and processors. In Design and Verification Conference (DVCon), 2007. South Pacific Design Automation Conference, pages 424–429, 2004.