Source b848e815... — STIMSMITH

SOURCE ARCHIVE

SHA256: b848e8157efea6349a1097e56a99547df2492911b5dc5e092d321febda13eecc

URL: https://agra.informatik.uni-bremen.de/doc/konf/2009-rsp-issgeneration.pdf

TYPE: application/pdf

SIZE: 174.2 KB

FETCHED: 5/29/2026, 11:26:35 PM

EXTRACTOR: liteparse

CHARS: 44,759

EXTRACTED CONTENT

44,759 chars

Generating an Efficient Instruction Set Simulator from a Complete Property Suite

    Ulrich Kühne Sven Beyer Christian Pichler

Institute of Computer Science OneSpin Solutions GmbH University of Bremen Theresienhöhe 12 28359 Bremen, Germany 80339 Munich, Germany ulrichk@informatik.uni-bremen.de {Sven.Beyer, Christian.Pichler}@onespin-solutions.com

    Abstract                                                        the ISA with a certain degree of independence, there is a
                                                                    risk that the actual design behaves differently from the ISS
Instruction  set simulators   can  be   used for   the      early   in some cases. Such a discrepancy between ISS and design
development and testing of software for a processor before          may lead to erroneous software: while the software behaves
it is manufactured. While gate-level simulation offers cycle-       as expected in the ISS, it does not work properly on chip.
accurate results, performance of the simulation is typically            In order to avoid the manual effort of developing an ISS,
not sufficient for in-depth software testing. In addition, such     it is also possible to automatically derive an ISS from a
a gate-level simulation cannot be carried out in the early          high level or register transfer level (RTL) description of a
phases of the design process when only the instruction              processor [3]. Note that for optimized or pipelined RTL
set architecture (ISA) is present and the design is not yet         designs, such a higher level description is quite different
complete. Therefore, more abstract simulators are based             from the actual design. Therefore, the automatic extraction
on the ISA; these simulators can achieve a performance              of the ISS from the actual design is not feasible for these
of several million instructions per second. However, by             cases.
introducing a simulator separate from the design, the ISA                   In order to achieve an ISS that really corresponds to
has to be re-implemented for the simulator. Therefore, there        the design, the ISS needs to be derived from the ISA
is a risk that the instruction set simulator is not in sync         that is actually used in verification. What is more, the
with the design or the ISA. We present an approach to               verification should be carried out formally because only
automatically generate an instruction set simulator from a          formal verification offers the  chance of eliminating     all
complete property suite, which can be used for the formal           discrepancies between ISA and the design. Today, formal
verification of the processor. In  this way, we    obtain       a   hardware verification is already used in industry. Especially
provably correct simulator with relatively small effort. We         for safety critical systems involving medium size processor
show the feasibility of the approach for an industrial design;      designs and embedded systems, formal verification can offer
the performance of the resulting simulator is shown to be           high quality solutions [4], [5]. One successful technique is
comparable to custom state-of-the-art simulators.                   Interval Property Checking (IPC) [6], a technique similar to
                                                                    Bounded Model Checking [7]. IPC is used in order to check
1. Introduction                                                     if a design satisfies a set of properties which is written in
                                                                    a dedicated verification language. In contrast to simulation
        In today’s processor and system design flows, instruction   based methods which are not able to exhaustively cover all
set simulators (ISS) play an important role. One major field        possible inputs for large designs, formal methods allow for
of application of ISS is pre-silicon software development,          a gap-free, i.e., complete verification. Here, complete means
enabling the simulation of software before the target system        that the properties capture the behavior of the design in
is manufactured or even the design is finished.                     a unique way for each possible combination of states and
            With increasing system complexity, simulation perfor-   inputs. Having finished the formal verification phase, the
mance in terms of executed instructions per second has              set of properties forms a functionally equivalent model of
become an important factor. Thus, rather than simulating            the verified design. This methodology generally offers the
program code on a gate level or cycle accurate model of             highest quality of verification.
the design, ISS are based on the instruction set architecture                In this paper, we show how a complete property suite
(ISA) and implemented in high level languages like C++.             resulting from the formal verification of a processor can
For such an ISS, the ISA has to be reimplemented manually.          be reused to automatically generate an ISS. In this way,
There are several tools that provide dedicated languages for        the simulator is guaranteed to comply with the ISA that
the description of instruction set architectures (see e.g. [1],     has been used for the verification of the processor; by
[2]). However, since both design and ISS are derived from           construction, the ISS also complies to the design. This is

                         ISS                                            The  idea of generating an executable model from          a
                                                                     property suite is also used in [9]. In this work, it is even
 automatically generated              equivalent by construction     possible to generate models from incomplete specifications,
                                                                     resulting in partially nondeterministic behavior. However,
                                                                     the emphasis of [9] is on the generation of verification-
                                                                     friendly hardware designs rather than an efficient simulation
                                                                     model.
                                                                          The automatic generation of instruction set simulators is
ISA      formal equivalence proof         RTL                        examined in many publications. A common way to avoid the
                                                                     manual coding of high-performance simulators is the use of
                                                                     architecture description languages (ADL), which can then be
 Figure 1. Generating a Provably Correct ISS                         compiled into an ISS. Examples for such ADL are Facile [1]
                                                                     or  LISA      [2]. However, these approaches still require the

illustrated in Figure 1. Furthermore, by making use of an existing set of properties, the overhead for the creation of the ISS is relatively small. Nevertheless, the generated ISS offers a simulation performance comparable to state-of-the- art techniques. Note that the ISS can already be generated very early in the design and verification process, namely as soon as the ISA has been captured formally. With the completion of the verification, it is later on ensured that the generated ISS actually corresponds to the design. It turns out that even in a late phase mostly as a consequence of the formal verification—changes in the design or in the specification are likely to occur [5]. In this case, one can obtain an adapted ISS from the revised formal property suite with virtually no additional effort, using our approach. Using the property suite as a single source for the specification ensures the consistenc of software simulation with the verified design. This is a major contribution of this work. As the property suite offers a rigorous formalization of the specification, the simulator will reflect all sophisticated effects of the design that might be difficult to model using a high-level description, including e.g. exceptions and asynchronous interrupts. The paper is structured as follows: related work is dis- cussed in Section 2. In Section 3, the formal verification techniques used in this work are reviewed. The generation of the ISS is described in Section 4. In Section 5, experi- mental results are discussed, followed by the conclusions in Section 6. 2. Related Work Another approach for a tight interaction between high- level simulation and verification is presented in [8]. The method is complementary to this work in that an architectural description as starting point. From description, an implementation is generated, as assertions to ensure the correctness of the design. the approach relies on simulation-based and semi-formal verification techniques, that do not allow for a complete verification. Furthermore, it is limited to those domains where a high-level synthesis is sufficient. Highly optimized RTL designs cannot be handled. reimplementation of the processor semantics in the ADL. Thus, the functional equivalence of the ISS and the design remains to be shown. A different approach is presented in [10]. There, the start- ing point is a structural description of all the components on a processor’s data path. From this description, an instruction set is extracted automatically. The information can be used to generate an ISS as well as an RTL implementation. But, as the description can get quite complex, this does not replace of the design flow— the verification of the design. Furthermore, generated RTL code is typically not suited for highly efficient designs. 3. Background 3.1. Instruction Set Simulation y As for the technical aspects of instruction set simula- tors, there are mainly three different paradigms: interpretive simulation, compiled simulation and just-in-time compiled simulation. They differ in flexibility and performance. In- terpretive simulators decode the instructions to be executed one by one. In this way, they offer the highest flexibility concerning run-time modifiable programs. The bottleneck in interpretive simulation is the instruction decoding. Compiled simulators carry out the decoding and in some cases even static scheduling at compile time. However, this technique is not applicable for run-time modifiable code and for dynamic scheduling. Therefore, just-in-time compiled sim- ulation (JIT-CS) tries to combine the best of both worlds. In JIT-CS, information on previously decoded instructions is stored in a cache and can be reused when the instruction is executed again. In this way, a simulation performance comparable to compiled simulation can be achieved without it uses losing the flexibility of the interpretive approach [2]. this well as 3.2. Formal Verification However, Within the last two decades, there has been a great amount of research in formal verification techniques. Methods based on Boolean satisfiability (SAT) have proven to be a robust solution. One prominent technique is SAT based Bounded Model Checking (BMC), that has first been described in

         [7]. Successive improvements in performance have made   analysis integrated within an IPC verification environment
   BMC a suitable method for the formal verification of larger   is commercially available in [12].
    scale designs. For the work at hand, we use the techniques            Completeness analysis determines whether every possible
described in [6], referred to as    interval property checking   input scenario—corresponding to a transaction sequence of
   (IPC). In the following, this verification methodology will   the design—can be covered by a chain of properties that
be briefly outlined.                                             predicts the value of states and outputs at every point in time.
   In contrast to the original BMC, only safety properties are   In other words, any two designs fulfilling all the properties
  verified using IPC. As digital circuits always have a finite   of a complete property suite are formally equivalent. The
 response time, this is not a serious restriction in practice.   completeness analysis basically boils down to check in the
   It is rather natural to describe the intended behavior of a   end state of each property whether (1) there is always
design in terms of safety properties in order to formalize the   a successor property with matching assumptions, (2) the
 specification. Furthermore, this restriction leads to bounded   successor property is uniquely determined and (3) each
properties that can be checked efficiently using a SAT solver.   property describes the outputs and states of the DUV in
          The main idea of IPC is to use an arbitrary starting   a unique way. For a detailed description of the methodology
  state instead of the initial state used in BMC. Any property   please refer to [5], [11].
   that holds starting from an arbitrary state then also holds
  from any reachable state, i.e., it is exhaustively verified.   3.4. Verification Language
 Conversely, false negatives can occur in IPC, i.e. counterex-
      amples for properties starting in unreachable states may               The properties presented here are written in the ITL
         be produced. These false negatives need to be removed   language [4]. In ITL, temporal logic expressions are used to
 by adding invariants in order to restrict the starting state.   describe the behavior of a synchronous sequential system.
         For more details on the idea of IPC and the following   The discrete time steps correspond to the clock cycles of the
formalization, refer to [6].                                     described system. Figure 2 shows an example of a simple
    A synchronous circuit is modeled as a finite state machine   ITL property. Usually, the properties have an implication
     (FSM) M = (I, S, S0, ∆, Λ, O) with input alphabet I ⊆ Bn,   structure: if the expressions in the  assume       part evaluate
output alphabet  O ⊆ Bw, a finite set of states S        ⊆ Bm,   to true, then the expressions in the  prove       part must hold
   output function Λ and next state function ∆. The set S0 ⊆ S   as well. In the freeze      section, expressions can be assigned
 denotes the initial states. With next state function ∆ : Bn ×   to variables that are fixed to a certain time point, i.e. the
   Bm → Bm, the transition relation of the circuit is given by   freeze variable c1 refers to the value of signal       c at time
             T (s, s′) = ∃x ∈ Bn : s′ ≡ ∆(x, s)            (1)   point t + 1, no matter in which temporal context it is used.
A safety property      f        = AG(ϕ) can be translated to a   In the temporal expressions, the standard operators of the
Boolean function [[f ]]t, checking the validity of the formula   respective HDL language (VHDL or Verilog) can be used,
  ϕ at time point t. Here, the translation is done such that a   as well as the operators          next and prev, which shift the
satisfying assignment of [[f ]]t   corresponds to a counterex-   enclosed expression relatively one cycle into the future or
     ample of ϕ. The resulting function depends on the inputs,   the past, respectively.
 outputs and states within a bounded time interval [0, c]. IPC          Thus, the first line in the prove part of Figure 2 states
searches for counterexamples by solving the SAT instance         that at time point t + 1, the value of signal        c must have
                                                                 increased by one compared to the previous time step. The
                 ∧c     T (st+i, st+i+1) ∧ [[f ]]t         (2)   next line shows an equivalent expression, making use of the
                                                                 freeze variable c1.
                 i=0                                                      ITL also supports macro functions, which are a powerful
  The transition relation is unrolled within the time interval   mechanism to achieve abstraction and write high-level prop-

[0, c] and it is connected to the single instantiation of [[f ]]t. erties. Furthermore, all data types of the respective HDL are supported, including arrays as well as user defined types and 3.3. Completeness nested record data types in VHDL. IPC is a powerful verification technique, enabling the 4. Generating the Instruction Set Simulator formalization of a specification in terms of safety properties and its verification against the implementation. However, to With the techniques presented above, it is possible to be sure that no bugs have been missed, the verification en- perform a complete verification even for industrial designs gineer needs to reason about the completeness of the written [4], [5]. If the verification is completed successfully, the property suite. A technique to formally check whether a set property suite forms a model of the verified design, i.e. the of properties forms a complete specification is described properties describe the transitions and the output behavior in [11]. Applications of the method on industrial proces- of the design in a unique way. This fact and the use of sor designs can be found in [5]. Automatic completeness abstraction in the verification can be exploited to obtain an

property simple; property instrADD; freeze: freeze: c1 = c@t+1; opcode = instr(15 downto 11)@t, // decoding of regA = instr(10 downto 8)@t, // instruction assume: regB = instr( 7 downto 5)@t, // word at t: reset = ’0’; regD = instr( 4 downto 2)@t;

prove: assume: at t+1: c = prev(c) + 1; at t: opcode = ADD_op; // processor ready at t: c1 = c + 1; at t: not stall; // to execute end property; at t: not interrupt; // instruction ADD

          Figure 2. Simple ITL Property                           prove:
                                                                  at t: write_reg(regD, vreg(regA) + vreg(regB));
                                                                  [. . . ]
                                                                  end     property;

executable model that captures the entire behavior of the macro write_reg(i, res: unsigned): boolean := design: a simulator. foreach k in 0..7: In this section the generation of the ISS is described. First if (k = i) then we discuss the techniques applied during the verification, elsenext(vreg(k)) = res(15 downto 0); which are used to define an abstract state of the design, next(vreg(k)) = vreg(k); corresponding to the architectural state of a processor. Then, end if; in Section 4.2 it is shown how the property suite can be endend foreach; formulated to allow for an automatic translation into an macro; equivalent C++ program. The translation itself is presented Figure 3. Operation Property in Section 4.3.

4.1. Abstraction pipelined processor. In the freeze section of the property, the As stated in section 3.2, after having completed the instruction word is decomposed according to the specifica- verification, the properties form a model of the verified tion into the opcode and the addresses of source and target design. However, the equivalence between the properties registers. The assume part states that there is actually an and the design under verification (DUV) is not achieved ADD instruction, and that the processor is ready to execute by simply reimplementing the complex logic of the circuit it. Under these constraints it is proved that one time step in the verification language ITL. Instead, the properties are later the register file is updated with the correct value. The formulated in a compact and readable form by making use forwarding logic of the pipeline is hidden in the mapping of abstraction techniques. function vreg which represents the architectural register file. The view of the DUV that is expressed by the properties Due to completeness requirements, the property also needs is a high-level operation view. For a processor, an operation to claim that the remaining registers will not change their naturally corresponds to the execution of a single instruction. value—this is included in the write_reg macro—and it Thus, each property describes the change of the internal must specify the output behavior of the processor. The latter state and the behavior of the output signals, when the statements are omitted in the figure. processor executes an instruction. The state of the DUV is described in terms of a high-level or architectural state, 4.2. Architectural Style Properties which corresponds to a programmer’s view on the visible registers of the design. This abstraction is achieved by the The basis for an ISS is an architecture description; such use of mapping functions. a description mainly consists of an architectural state and As an example, consider the register file of a pipelined a next state function describing the effect of each of the processor. The behavior of the implementation registers in instructions and interrupts on this state. For a set of oper- the design may depend on several instructions that are cur- ation properties as illustrated in Figure 3, the architecture rently processed by the pipeline. Consequently, the mapping description is rather implicit; hence, a generic and fully function that links the architectural state of the register file automatic extraction from the property would be quite hard. to the implementation captures the forwarding logic of the We therefore focus on reformulating the properties in a so- pipeline. Note that these mapping functions are still much called architectural style which allows for a straightforward more compact than the implementation (see also [5]). generation of the ISS; also supporting operation properties By using these mapping functions as representatives of would be a possible extension of this work after showing the state of the design, the operation properties strongly the feasibility of the approach. resemble a high level specification. As an example, Figure 3 Note that the reformulated property is checked against the shows the ITL property for an ADD instruction of a simple RTL and the automatic gap-detection is executed as well.

   Hence, any possible discrepancy between the architectural                      macro next_state(state:             state_t;

and operation properties are identified automatically and, in iw: instruction_t): state_t := particular, the reformulated property set is still equivalent to case iw.opcode is the design. An architectural style verification is characterized when ADD_op => by: update( state.register(iw.regD),( state.register(iw.regA) + 1) Explicit modeling of the architectural state and the [. . . ] state.register(iw.regB) ) ); interfaces to memories or ports end case; 2) Explicit definition of a macro next_state capturing the end macro; effects of instructions and interrupts on the architec- tural state Figure 4. Next State Function 3) Explicit definition of macros that capture the behavior of the interfaces to memories and ports 4) Explicit definition of the architectural reset state CPU vstate ISA Note that if the verification has been carried out in architectural style to begin with, the ISS can be generated from the verification without any manual steps in between. T next_state The reformulation of operation properties does not require any new detailed consideration of the design’s behavior. The CPU’ vstate identification of the components of the architectural state and ISA’ the precise description of the semantics of the instructions is carried out in the verification phase, anyway. In fact, one Figure 5. Structure of Equivalence Proof of the most sophisticated parts of the verification is to find the appropriate mapping functions for the architectural state; these mapping functions are not needed at all in order to and then mapping the resulting architectural state ISA to the automatically generate the ISS. new architectural state ISA′ using next_state corresponds We only describe the main technical aspects of the to applying the transition relation T on the CPU and then formulation in an architectural style in the following. The mapping the resulting implementation state CP U ′ to the architectural state is established in terms of a user defined architectural state. Furthermore it has to be proved that the VHDL record data type. This record combines all parts interface signals of the design actually behave as captured of the architectural state. Typically, this includes a register by the interface macros, and that the implementation reset file as well as status flags and a program counter of the state of the design complies with the defined architectural processor. In the same way, new data types describing the reset state. interfaces to memories and ports are defined. One essential aspect for the modeling of the ISS in ITL 4.3. Generation of the ISS is the newly introduced keyword update, which allows for the explicit definition of a write access to an array or record The outcome of the reformulation in architectural style data structure. As an example, Figure 4 shows part of the is a single property capturing all of the behavior of the next_state function for a simple processor. Here, the variable verified design by making use of the next_state function (see state of type state_t is the record holding all the elements Figure 6). In other words, one obtains a formally checkable of the architectural state. In another record iw of type ISA description. This is the starting point for the generation instruction_t, the decoded fields of the current instruction of the C++ instruction set simulator. In this section we word are kept. Using this information, the repeated decoding discuss how the ISS is automatically generated by translating of the same instruction can be avoided in the simulator (see the property. Section 4.3). The return value of the macro next_state is the Note that for the generation of the ISS, it is not required architectural state modified by the execution of the current to carry out the full equivalence proof between ISA and RTL instruction. The execution itself is modeled in a case block, upfront or even to identify the mapping functions between stating that when the opcode refers to an ADD instruction, the the architectural state and the implementation state. Instead, register file of the current architectural state will be updated it is possible to develop the ISA at the beginning of the by the sum of the source operands. Hence, the next_state design process in ITL and use it for an early generation of functions forms the core of the ISA. the ISS. This ISS only needs to be generated again in case The intuition of the equivalence proof between RTL and the ISA is updated, for example because a bug is found in the ISA is shown in Figure 5. The dashed arrows represent the ISA during the formal verification. Therefore, a generated abstraction function vstate that maps the implementation ISS is already available when the verification starts; the full state of the CPU to the state of the ISA. The aim is to prove confidence that the ISS complies to the design is achieved that applying the mapping vstate to the implementation state additionally at the end of the verification.

 property   isa;                                              Due to the locality of most software—caused e.g. by loop
 freeze:                                                      constructs—this is a very efficient technique to decrease
 instr             = decode(instruction) @ t,                 simulation run-time.
 isa_state   = vstate @ t,                                            The generated C++ class forms the core functionality of
 isa_state_p = vstate @ t+1,
 nstate      = next_state(isa_state,     instr) @ t;          the ISS. Besides this, the user has to provide a suitable
 assume:                                                      wrapper, which  calls the generated public functions         of
 at t: not  stall; // ready to start instruction              the simulator class to trigger the execution of the single
                                                              instructions. Moreover, the wrapper is used to connect the
 prove:                                                       simulation core to peripheral components like external mem-
 at t+1:    isa_state_p = nstate;                             ories or buses. It is also possible to integrate the simulation
 [. . . ]
 end      property;                                           core with commercial simulation and debugging tools, as
                                                              has been done for the experiments that are discussed in the
             Figure 6. Architectural Style Property           next section.

                                                              5. Experimental Results
    The core of the ISS is a C++ class Sim. It contains all

of the code for the execution of instructions and it holds the For a first evaluation of the proposed method, an ISS architectural state. Now that the ITL description is already in was generated for a small pipelined processor. The CPU a form similar to a procedural language, the translation of the contains a total of 8 registers with 16 bit and a special functionality is straightforward. It comprises the following register for an interrupt return vector. It is implemented with steps: a 5 stage pipeline. The data memory is connected to the CPU 1) Generate public functions for next_state, decode and via a simple interface. The instruction set is made up of 7 the interface macros instructions, including logic, arithmetic, memory access and 2) Generate private functions for all remaining macros jump instructions. 3) Generate a member variable for the architectural state For comparison, an ISS for the processor was built using 4) Replace ITL/HDL data types and operations by C++ a commercial tool. This tool also provides a wrapper that types and operations supplies the simulation kernel with the instructions, as well 5) Replace update expressions by a direct array/struct as a graphical debugging interface. The ISS generated with overwrite in C++ our approach has also been integrated with this environment. In order to achieve a simulation performance comparable As for the commercial tool, the ISS showed an aver- to custom state-of-the-art simulators, several optimizations age performance of 0.22 million instructions per second are applied during the generation of the C++ code. As long (MIPS) using an interpretive approach, while a just-in-time as the bit size of the ITL/HDL data types is less than or compiled simulator achieved 14 MIPS. The ISS generated equal to the bit width of the hosting system, native data from the property suite showed a performance of 7 MIPS. types like unsigned integers are used. Only the remaining This shows that our method clearly outperforms interpretive large bit vectors are replaced by dedicated data structures. simulators—emphasizing the effectiveness of the optimiza- Similarly, all basic operations, like integer arithmetic and tion techniques described in Section 4.3—while it reaches simple logic operations, are mapped to the corresponding about 50% of the performance of a state-of-the-art JIT-CS native C++ operations. For more complex operations like bit simulation tool. slicing or bit rotation, as well as for operations on large bit As a second experiment, an ISS was generated for an vectors, a library with optimized functions that implement industrial processor design. The CPU contains a total of 64 the ITL/HDL operators is used. registers of 32 bit in multiple hardware contexts and offers a Furthermore, an analysis of the data dependencies in the number of advanced processor features. It is implemented as macro functions is performed in order to identify shared a 7 stage pipeline and connects to a data memory and a bus expressions. With this information, additional member vari- interface. It is capable of executing 88 different instructions ables can be inserted in the simulator class in order to hold based on the DLX instruction set architecture [13]. The temporary results and to cache intermediate results of the source code of the processor core adds up to about 10,000 computations, speeding up the simulation. lines of VHDL code, while the final reformulated property Finally, a technique similar to the just-in-time compiled suite has a size of 2,000 lines of ITL. The property suite simulation (see Section 3.1) is used. As described above, and its completeness were successfully checked against the there is a decode macro which decomposes the instruction processor design using [12] and thus can be considered a word into several bit fields according to the specification. correct and complete specification. These bit fields are stored in a record data type. By caching Like for the first experiment, another ISS was imple- the results of the decode function, it is possible to avoid mented using the commercial tool suite. It showed an the repeated instruction decoding during the simulation. average performance of 2.5 MIPS (just-in-time compiled

   Table 1. Performance of Different ISS                            in            Proceedings of the ACM SIGPLAN 2001 conference on
   Design  Interpretive       JIT-CS      Generated                 Programming language design and implementation (PLDI),
   P1         0.22 MIPS    14.0 MIPS       7.0 MIPS                 2001, pp. 321–331.
   P2                 -     2.5 MIPS       1.2 MIPS                 [2] G. Braun, A. Nohl, A. Hoffmann, O. Schliebusch, R. Leupers,
                                                                    and H. Meyr, “A universal technique for fast and flexible

simulation), while the ISS that was generated using our instruction-set architecture simulation,” IEEE Trans. on CAD, approach reached 1.2 MIPS. This confirms the results of vol. 23, no. 12, pp. 1625–1639, 2004. the first experiment, i.e. that the generated ISS shows a [3] R. Leupers, J. Elste, and B. Landwehr, “Generation of inter- performance comparable to modern custom made instruction pretive and compiled instruction set simulators,” in Proceed- set simulators. The results are summarized in Table 1, ings of the ASP-DAC, 1999. showing the simulation performance for the simple processor [4] K. Winkelmann, H.-J. Trylus, D. Stoffel, and G. Fey, “Cost- (P1) and the industrial design (P2) in terms of MIPS. efficient block verification for a UMTS up-link chip-rate We presume that the custom JIT-CS simulators are still coprocessor,” in DATE, vol. 1, 2004, pp. 162–167. faster than our solution due to technical issues. The com- [5] J. Bormann, S. Beyer, A. Maggiore, M. Siegel, S. Skalberg, mercial tools that were used to build the JIT-CS simulator are subject to a great amount of optimizations, while we T. Blackmore, and F. Bruno, “Complete formal verification provided here the basic methodology of ISS generation as a of TriCore2 and other processors,” in Design and Verification Conference (DVCon), 2007. proof of concept. Furthermore, the properties that the ISS is generated from reflect all hardware and pipeline effects that [6] M. Nguyen, M. Thalmaier, M. Wedler, J. Bormann, D. Stoffel, may not be included in a high level ISA description and that and W. Kunz, “Unbounded protocol compliance verification decrease the simulation performance. using interval property checking with invariants,” IEEE Trans. on CAD, vol. 27, no. 11, pp. 2068–2082, Nov. 2008. 6. Conclusions [7] A. Biere, A. Cimatti, E. Clarke, and Y. Zhu, “Symbolic model checking without BDDs,” in Tools and Algorithms for the In this paper, we have presented an approach to use a Construction and Analysis of Systems, ser. LNCS, vol. 1579. gap-free, i.e., complete property suite, formulated in an ar- Springer Verlag, 1999, pp. 193–207. chitectural style, to generate a C++ instruction set simulator. [8] A. Chattopadhyay, A. Sinha, D. Zhang, R. Leupers, G. As- The generation makes use of the fact that after the successful cheid, and H. Meyr, “Integrated verification approach during formal verification, the property suite forms an architectural ADL-driven processor design,” in IEEE International Work- model of the verified design and hence, the ISS is equivalent shop on Rapid System Prototyping (RSP), 2006, pp. 110–118. to the design by construction. By applying a number of [9] M. Schickel, V. Nimbler, M. Braun, and H. Eveking, “An optimizations during the generation of the C++ code, the efficient synthesis method for property-based design in formal performance of the resulting simulator is comparable to verification: On consistency and completeness of property- state-of-the-art commercial tools. sets,” in Advances in Design and Specification Languages for As formal verification is increasingly used for industrial Embedded Systems, S. Huss, Ed. Kluwer Acad. Publishers, designs, the presented method is a practical way to obtain 2006, pp. 163–182. a proven correct and efficient instruction set simulator. It [10] S. Weber, M. W. Moskewicz, M. Gries, C. Sauer, allows for an automatic adaptation of the simulator, when and K. Keutzer, “Fast cycle-accurate simulation and parts of the design or the specification need to be changed in instruction set generation for constraint-based descriptions a late phase of the design process. Using the formal property of programmable architectures,” in International Conference suite as a single source for the specification ensures correct on Hardware/Software Codesign (CODES), September 2004, results for early software development based on instruction pp. 18–23. [Online]. Available: http://www.gigascale.org/ set simulation. pubs/566.html [11] J. Bormann and H. Busch, “Method for determining the Acknowledgment quality of a set of properties, applicable for the verification and specification of circuits,” Patent, 2007, european Patent This research work was supported by the German Federal Application, publication number EP1764715. Ministry of Education and Research (BMBF) in the Project [12] OneSpin Solutions GmbH, Munich, Germany, OneSpin Veri- HERKULES under the contract number 01M3082. fication Solutions, http://www.onespin-solutions.com, 2009. References [13] P. M. Sailer, P. M. Sailer, and D. R. Kaeli, The DLX Instruction Set Architecture Handbook. San Francisco, CA, [1] E. Schnarr, M. Hill, and J. Larus, “Facile: a language USA: Morgan Kaufmann Publishers Inc., 1996. and compiler for high-performance processor simulators,”