Skip to content
STIMSMITH

BDD Constraint Solver

Concept WIKI v1 · 6/5/2026

A constraint solving mode available in the Synopsys VCS constraint solver that elaborates the entire solution space of a SystemVerilog randomize() call before selecting a value, then caches that solution space to accelerate subsequent calls. It is best suited to constrained-random problems of moderate size that are invoked many times, such as CPU opcode generation, and is one of two solver modes exercised by the VCS Constraint Profiler and by Hierarchical Constrained-Random Stimulus Generation flows.

Overview

The BDD Constraint Solver is a solving mode within the Synopsys VCS constraint solver that uses a Binary Decision Diagram (BDD) representation of a randomize() call's constraint system. It is the alternative to the default RACE solver mode and is exposed to users through the VCS constrained-random verification flow. Its defining behavior, performance profile, and applicability are documented in case studies of CPU/microcode stimulus generation, most notably the AMD microcode stimuli work published on Design & Reuse.

How It Works

Unlike incremental solvers that reason about constraints on the fly, the BDD solver first elaborates the entire solution space of a randomize() call before selecting a single solution. Once the solution space is built, the solver picks a value from it. The full solution space is then cached, which speeds up subsequent randomize() calls that pose the same problem (e.g., randomized the same way repeatedly in a CPU opcode generator). The trade-off is that the elaboration step itself can require a significant amount of memory and CPU time, because the full set of legal solutions must be constructed up front.

Memory and Performance Characteristics

  • Memory: Because the BDD solver materializes the entire solution space, it is typically the memory-dominant mode. Memory use is reported separately in the VCS Constraint Profiler because RACE memory use is generally not a limiting factor and is usually not measured for comparison.
  • Runtime: The cost of elaborating the solution space is amortized over many randomize() calls. It pays off when the same randomize() call is invoked repeatedly.
  • Architectural fit: The BDD solver works well for specific architectures, particularly when the randomize problem does not consume excessive memory and the same randomize() call is executed many times — a pattern that is common in CPU opcode generation.

Comparison with the RACE Solver

When a stimulus architecture was refactored from a single-class to a multiple-class layout (choosing an opcode category first, then randomizing within it), both solvers benefited, but the magnitudes differed:

  • RACE solver: 4x speedup moving from single-class to multiple-class.
  • BDD solver: 2x speedup moving from single-class to multiple-class.
  • The multiple-class layout also produced significantly better BDD-solver memory consumption than the single-class layout.

The performance gains are attributed to the smaller set of variables and constraints present in the multiple-class implementation, which reduces the size of the problem the solver must work on. In the published study, the multiple-class implementation had roughly 7x fewer constraints than the original single-class implementation.

Use Cases

The BDD solver is documented in the context of generating stimuli for complex instruction sets:

  • AMD microcode / x86 opcodes: The BDD solver mode is used to randomize x86 instructions, where the same opcode generation problem is solved many times and the amortization of the upfront solution-space elaboration is favorable.
  • General constrained-random verification: Any flow in which a randomize() problem of manageable size is called many times can benefit from the BDD solver's caching behavior.

Integration With Other Tools and Techniques

  • The VCS Constraint Profiler reports per-randomize() and per-partition CPU runtime and memory data. Memory profiling is highlighted as especially valuable when the BDD solver is enabled, because BDD elaboration is the main source of memory pressure; cumulative randomize CPU runtime, individual randomize CPU runtime, individual partition CPU runtime, and memory data are all reported.
  • The VCS 2009.12 release added a testcase extraction feature that automatically extracts the slowest partition from each randomize() call, which is useful for diagnosing BDD-solver hot spots.
  • The BDD solver is employed by Hierarchical Constrained-Random Stimulus Generation techniques, in which a higher-level randomize call (e.g., choosing an opcode category) gates a lower-level randomize call (e.g., the operands/encodings of that category), naturally creating the pattern of repeated, narrowly-scoped randomize() calls that the BDD solver serves efficiently.

When to Choose the BDD Solver

Based on the documented evidence, the BDD solver is the preferred mode when:

  1. The constraint problem is small enough that elaborating the full solution space fits in memory.
  2. The same randomize() problem is invoked many times, so the cached solution space is reused.
  3. The application domain (such as CPU opcode generation) calls for solving the same structured problem repeatedly across thousands of randomized instructions.

The default RACE solver is generally a better fit for one-shot, larger, or more varied randomize problems where the cost of building the full BDD solution space is not amortized.

CITATIONS

10 sources
10 citations
[1] The BDD solver elaborates the entire solution space of the randomize call before selecting a solution. Generating AMD microcode stimuli using VCS constraint solver
[2] Elaborating the entire solution space can take large amounts of memory, and the solver must spend some time elaborating the entire solution space. Generating AMD microcode stimuli using VCS constraint solver
[3] The solution space is cached to speed up subsequent randomization calls. Generating AMD microcode stimuli using VCS constraint solver
[4] The BDD solver works well for specific architectures, particularly if the randomize problem does not take excessive memory and the same randomize call occurs many times, as is often the case with CPU opcode generation. Generating AMD microcode stimuli using VCS constraint solver
[5] The VCS 2009.12 release provided a testcase extraction feature to extract the slowest partition from each randomize call automatically. Generating AMD microcode stimuli using VCS constraint solver
[6] The default RACE solver shows a 4x speedup, while the BDD solver shows a 2x speedup when moving from a single-class to a multiple-class architecture. Generating AMD microcode stimuli using VCS constraint solver
[7] Memory requirements were significantly better with the multiple-class architecture, and memory results were measured for the BDD solver only because RACE memory use is typically smaller and not a limiting factor. Generating AMD microcode stimuli using VCS constraint solver
[8] The multiple-class implementation had 7x fewer constraints than the original, allowing the solver to calculate solutions more efficiently. Generating AMD microcode stimuli using VCS constraint solver
[9] Randomizing instructions by first choosing the opcode category significantly simplified the problem; the constraint solver had a smaller problem to consider because only constraints specific to the opcode category were present. Generating AMD microcode stimuli using VCS constraint solver
[10] VCS Constraint Profile reports cumulative randomize CPU runtime, individual randomize CPU runtime, individual partition CPU runtime, and memory data; the memory data is particularly useful when the BDD solver is used. Generating AMD microcode stimuli using VCS constraint solver