Constraint Solver Wiki

Overview

The evidence discusses the constraint solver in the context of Synopsys VCS and SystemVerilog constrained-random verification, where it randomizes microcode instruction and opcode objects subject to constraints, weights, and legal combinations of instruction attributes. SystemVerilog constraint-language constructs provide a concise way to describe the possible attribute combinations and to control the distribution of individual fields, while the VCS constraint solver applies generator-layer weights to control the distribution of opcode types. [1]

Role in opcode stimulus generation

The described opcode generator uses a two-layer architecture. The upper layer is a SystemVerilog random sequence with weighted knobs that control high-level stimulus distribution. The lower layer is an opcode class randomized with additional constraints and weights from the upper layer. Tests provide weighted values that direct the required instruction mix, and the constraint solver applies those weights during generation. [2]
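The two-layer split can be illustrated with a minimal Python sketch. It is only an analogy for the structure described above, not SystemVerilog and not the generator from the evidence: the opcode types, the operand field, and the function names are all hypothetical, and a plain weighted choice stands in for both the random sequence and the solver.

```python
import random

# Hypothetical opcode types and their legal operand kinds (illustration only).
LEGAL_OPERANDS = {
    "load": ["reg", "mem"],
    "store": ["mem"],
    "alu": ["reg", "imm"],
}


def pick_opcode_type(weights, rng):
    """Upper layer: choose an opcode type using test-supplied weights."""
    types = list(weights)
    return rng.choices(types, weights=[weights[t] for t in types], k=1)[0]


def randomize_opcode(opcode_type, rng):
    """Lower layer: randomize fields under the constraints for that type."""
    operand = rng.choice(LEGAL_OPERANDS[opcode_type])
    return {"type": opcode_type, "operand": operand}


def generate(weights, count, seed=0):
    """Run both layers 'count' times with a seeded generator."""
    rng = random.Random(seed)
    return [randomize_opcode(pick_opcode_type(weights, rng), rng)
            for _ in range(count)]


# A test directs the instruction mix only through the weights.
stream = generate({"load": 6, "store": 3, "alu": 1}, count=1000)
```

In this sketch a test changes the instruction mix by passing different weights to generate, without touching the lower-layer rules.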

This use of a constraint solver addresses a limitation of traditional serial field randomization: randomizing instruction fields one after another can produce skewed distributions and gives limited control over the overall stimulus distribution. A constrained-random approach can improve distribution control, but if the modeled instruction set is complex, the resulting randomization problem can become too large to solve quickly. [3]
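The skew from serial randomization can be worked through on a toy example. The Python sketch below is an assumption-laden illustration: the two fields, their value ranges, and the rule linking them are hypothetical, and the joint case assumes a solver that draws uniformly over all legal combinations. It computes exact probabilities rather than sampling.

```python
from fractions import Fraction
from itertools import product

A_VALUES = [0, 1]
B_VALUES = [0, 1, 2, 3]


def legal(a, b):
    # Hypothetical rule: when a is 0, b must also be 0.
    return a != 0 or b == 0


# Every legal (a, b) combination: (0,0), (1,0), (1,1), (1,2), (1,3).
SOLUTIONS = [(a, b) for a, b in product(A_VALUES, B_VALUES) if legal(a, b)]


def joint_distribution():
    """Solve both fields together: each legal combination is equally likely."""
    p = Fraction(1, len(SOLUTIONS))
    return {solution: p for solution in SOLUTIONS}


def serial_distribution():
    """Pick a uniformly first, then pick b uniformly among values legal for a."""
    dist = {}
    feasible_a = sorted({a for a, _ in SOLUTIONS})
    for a in feasible_a:
        legal_b = [b for (x, b) in SOLUTIONS if x == a]
        for b in legal_b:
            dist[(a, b)] = Fraction(1, len(feasible_a)) * Fraction(1, len(legal_b))
    return dist


joint = joint_distribution()
serial = serial_distribution()
```

Here the combination (0, 0) has probability 1/5 when the fields are solved together but 1/2 when they are randomized one after another, while each combination with a equal to 1 drops from 1/5 to 1/8.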

Constraint problem size and architecture

A single-class opcode model is flexible because constraints can be applied across any data members in the opcode class. However, the evidence reports that this style may be slow because the solver receives many random variables and a large, complex constraint set; the cited opcode class contained about 100 random variables and 800 constraint equations. [4]

The evidence describes an object-oriented, hierarchical alternative: a base class contains global constraints shared by all opcodes, while derived subclasses define groups of related opcodes with similar constraints. Partitioning the constraints into smaller opcode groups reduced memory requirements and improved performance. [5]

A key optimization is to randomize the opcode category first, so that the solver sees only the constraints relevant to that category. In the reported comparison, the newer multiple-class implementation had one-seventh as many constraints as the original single-class implementation, allowing the solver to compute solutions more efficiently. [6]
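The effect of choosing the category first can be sketched in Python. This is a hedged illustration, not the implementation from the evidence: the categories, fields, and rules are hypothetical, constraints are modeled as plain predicates, and brute-force enumeration over a tiny domain stands in for a real solver.

```python
import random
from itertools import product

REGISTERS = range(16)

# Hypothetical global rule shared by every opcode: register 0 is never a destination.
GLOBAL_CONSTRAINTS = [
    lambda op: op["dst"] != 0,
]

# Hypothetical category-specific rules, one group per opcode category.
CATEGORY_CONSTRAINTS = {
    "load": [lambda op: op["src"] % 4 == 0],
    "alu": [lambda op: op["src"] != op["dst"], lambda op: op["src"] < 8],
}


def constraints_for(category):
    """Constraints the solver sees once the category is already fixed."""
    return GLOBAL_CONSTRAINTS + CATEGORY_CONSTRAINTS[category]


def single_class_constraint_count():
    """Constraints a single-class model would hand to the solver at once."""
    return len(GLOBAL_CONSTRAINTS) + sum(
        len(group) for group in CATEGORY_CONSTRAINTS.values())


def randomize_opcode(weights, rng):
    """Pick the category first, then solve only that category's constraints."""
    categories = list(weights)
    category = rng.choices(
        categories, weights=[weights[c] for c in categories], k=1)[0]
    active = constraints_for(category)
    candidates = [
        {"category": category, "src": src, "dst": dst}
        for src, dst in product(REGISTERS, REGISTERS)
        if all(check({"src": src, "dst": dst}) for check in active)
    ]
    return rng.choice(candidates)
```

Even in this small sketch a load is solved against two constraints rather than the four that a combined model would carry; the gap widens as more categories are added.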

Solver modes and profiling

The evidence identifies two VCS solver modes used in the comparison: the default RACE solver and a BDD solver. The BDD solver elaborates the entire solution space of a randomize call before selecting a solution. This can consume substantial memory and time, but the solution space is cached to accelerate later calls. The BDD solver is described as working well when the randomization problem does not require excessive memory and the same randomize call occurs many times, as in CPU opcode generation. [7]
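The elaborate-once, reuse-many-times trade-off can be illustrated with a small Python sketch. It is not a BDD and does not reflect how VCS is implemented: a brute-force list of solutions stands in for the elaborated solution space, and the class name, the call-site key, and the example constraint are hypothetical.

```python
import random
from itertools import product


class CachingSolver:
    """Toy solver that enumerates a solution space once per call site."""

    def __init__(self):
        self._cache = {}
        self.elaborations = 0  # how many times a full enumeration was paid for

    def randomize(self, call_site, domains, constraint, rng):
        if call_site not in self._cache:
            # Up-front cost: build every legal assignment (time and memory).
            names = list(domains)
            self._cache[call_site] = [
                dict(zip(names, values))
                for values in product(*(domains[n] for n in names))
                if constraint(dict(zip(names, values)))
            ]
            self.elaborations += 1
        # Later calls only pick from the cached solution space.
        return rng.choice(self._cache[call_site])


solver = CachingSolver()
rng = random.Random(0)
domains = {"src": range(8), "dst": range(8)}


def distinct_registers(op):
    # Hypothetical constraint: source and destination must differ.
    return op["src"] != op["dst"]


results = [solver.randomize("alu_op", domains, distinct_registers, rng)
           for _ in range(100)]
```

The first call pays for the whole enumeration and the remaining 99 reuse it, which mirrors why the approach suits a randomize call that repeats many times and becomes costly when the solution space is too large to hold in memory.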

VCS constraint profiling reports cumulative and individual randomize CPU runtime, individual partition CPU runtime, and memory data. The VCS 2009.12 release also provided testcase extraction for automatically extracting the slowest partition from each randomize call. [8]
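The kind of bookkeeping behind cumulative and individual runtimes can be sketched in Python. This is a generic illustration and does not reproduce the VCS profiler, its options, or its report format; the class and method names are hypothetical, and memory data and partition-level timing are left out.

```python
import time
from collections import defaultdict


class RandomizeProfile:
    """Collects per-call runtimes, grouped by randomize call site."""

    def __init__(self):
        self.samples = defaultdict(list)

    def record(self, call_site, seconds):
        """Store the runtime of one individual call."""
        self.samples[call_site].append(seconds)

    def timed(self, call_site, fn, *args, **kwargs):
        """Run fn, record how long it took, and return its result."""
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        self.record(call_site, time.perf_counter() - start)
        return result

    def cumulative(self, call_site):
        """Total runtime spent at one call site."""
        return sum(self.samples[call_site])

    def slowest_call_site(self):
        """Call site with the largest cumulative runtime."""
        return max(self.samples, key=self.cumulative)


profile = RandomizeProfile()
# Hand-entered timings keep the example deterministic.
profile.record("opcode.randomize", 0.5)
profile.record("opcode.randomize", 0.25)
profile.record("operand.randomize", 0.5)
```

Ranking call sites by cumulative time points at the randomize call worth shrinking first, which in this sketch is opcode.randomize.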

Reported performance observations

In the reported AMD microcode stimulus-generation study, the multiple-class architecture was faster than the single-class architecture with both solver modes and for both tested opcodes. The default RACE solver showed a 4x speedup, while the BDD solver showed a 2x speedup. Memory requirements also improved significantly in the multiple-class architecture; the study measured BDD memory because RACE memory use was typically smaller and not the limiting factor. [9]

Practical guidance from the evidence

For constrained-random opcode generation, the evidence supports the following practices:

Keep each randomize problem as small as practical by partitioning constraints hierarchically.
Separate global opcode constraints from category-specific constraints.
Choose a high-level opcode category before randomizing detailed fields, so irrelevant constraints are not included in the solver problem.
Use solver profiling to identify slow randomize calls and expensive partitions.
Consider the BDD solver when repeated calls can benefit from cached solution spaces, but account for its up-front memory and elaboration cost. [6][7][8]