Definition
A virtual sequence is a verification construct that coordinates stimulus across multiple DUT interfaces. In one cited UVM environment, each virtual sequence creates interface-specific transactions and sends them to the corresponding interface agent; the agent driver then stimulates the matching DUT sub-interface with the transaction values. Because the virtual sequence does not know exactly when a transaction is driven, a monitor captures the interface state and sends it back through the sequencer, allowing the virtual sequence to react and produce new stimulus. [C1]
In the cited Multi-Armed Bandit (MAB) verification formulation, a virtual sequence is defined more compactly as a collection of test sequences that drive each interface of the DUT. For the Instruction Fetch (IF) unit example, one virtual sequence contains four sequences in total: one sequence per IF-unit interface. [C2]
Role in a UVM verification environment
The cited VPU verification environment uses UVM to build a modular, scalable, and reusable verification environment. Each semi-independent sub-interface has its own agent, and each agent contains a sequencer, driver, and monitor connected to a virtual interface. Virtual sequences interact with those agents by creating transactions, receiving monitored interface state through the sequencer, and generating reactive stimulus. [C1]
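The interaction between a virtual sequence and its agents can be sketched in Python. This is an illustrative model only, not UVM code: the `Transaction`, `Agent`, and `VirtualSequence` classes, the interface names, and the "payload plus one" monitor response are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transaction:
    interface: str
    payload: int


class Agent:
    """Hypothetical stand-in for a UVM agent (sequencer, driver, monitor)."""

    def __init__(self, interface: str) -> None:
        self.interface = interface
        self.driven: List[Transaction] = []  # transactions the driver applied

    def send(self, txn: Transaction) -> int:
        # Driver: apply the transaction to the modelled sub-interface.
        self.driven.append(txn)
        # Monitor: report the observed interface state back via the sequencer.
        # ASSUMPTION: the "state" is simply the payload plus one.
        return txn.payload + 1


class VirtualSequence:
    """Coordinates stimulus across several agents and reacts to their state."""

    def __init__(self, agents: Dict[str, Agent]) -> None:
        self.agents = agents

    def run(self, steps: int) -> Dict[str, int]:
        state = {name: 0 for name in self.agents}
        for _ in range(steps):
            for name, agent in self.agents.items():
                # Reactive stimulus: the next payload depends on the last
                # state the monitor reported for this interface.
                txn = Transaction(interface=name, payload=state[name])
                state[name] = agent.send(txn)
        return state
```

The virtual sequence never drives pins itself; it only creates transactions and consumes monitored state, which mirrors the division of labour described above.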
Because the VPU environment includes seven unique sub-interfaces that communicate with one another, the implementation uses UVM events to keep virtual sequences synchronized. The cited work notes that UVM events can carry data along with the event trigger, which simplified communication among virtual sequences. [C3]
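The idea of an event that carries data with its trigger can be illustrated with a small Python analogue. The `DataEvent` class, its method names, and the payload fields are hypothetical and are not the UVM API; the sketch only shows one party waking another and handing over a value in the same step.

```python
import threading


class DataEvent:
    """Hypothetical event that delivers a payload together with its trigger."""

    def __init__(self) -> None:
        self._flag = threading.Event()
        self._data = None

    def trigger(self, data=None) -> None:
        self._data = data  # store the payload before waking any waiter
        self._flag.set()

    def wait_trigger_data(self, timeout=None):
        # Block until triggered, then return the payload sent with the trigger.
        if not self._flag.wait(timeout):
            raise TimeoutError("event was not triggered")
        return self._data


def demo():
    """One 'sequence' waits for state published by another."""
    event = DataEvent()
    received = []

    def consumer() -> None:
        received.append(event.wait_trigger_data(timeout=5.0))

    worker = threading.Thread(target=consumer)
    worker.start()
    event.trigger({"addr": 0x40, "valid": True})  # hypothetical payload
    worker.join()
    return received
```

Passing the payload with the trigger avoids a separate shared variable and the ordering problems that come with it.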
Virtual sequences in MAB-driven coverage closure
In the MAB-based verification flow, virtual sequences are pre-selected and used as the "slot machines" available to the MAB policy. Each sequence is applied to the DUT for a number of cycles, and a reward is recorded from its coverage performance against selected functional properties. The MAB policy then recommends which sequence to select next, balancing exploitation of sequences that have increased coverage in the past with exploration of other sequences that may perform better. [C4]
The framework fixes both the set of virtual sequences and the sequences inside each virtual sequence before simulation. Their parameters are pre-selected using representative random parameter sampling, and the framework does not allow a selected sequence to change its parameters or constraints during simulation. This fixed behavior lets the MAB policy learn the performance of repeatedly played sequences and penalize poorly performing ones. [C5]
UCB1 selection behavior
The cited framework uses the UCB1 algorithm to choose which of the pre-selected virtual sequences to apply. In each trial, UCB1 selects a virtual sequence based on the rewards observed in previous trials. The method first plays each of the K virtual sequences once for M cycles to initialize the mean payoff of each sequence. In later trials, it selects the sequence with the highest upper-confidence estimate, observes the trial reward, and updates that sequence's mean reward. [C6]
Only one virtual sequence is allowed to operate in each round; a non-selected virtual sequence remains frozen and contributes no reward during idle periods. The cited work also states that the selected virtual sequence operates without reseeding or parameter tuning. [C6]
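The selection loop described in this section can be sketched as follows. The sketch uses the textbook UCB1 index (mean reward plus `sqrt(2 ln n / n_i)`); the exact index and constants used in the cited framework are not given here, so that form is an assumption. The `play` callback is a hypothetical stand-in for running the chosen virtual sequence for M cycles and returning its reward in [0, 1].

```python
import math
from typing import Callable, List, Tuple


def ucb1(
    num_arms: int,
    num_trials: int,
    play: Callable[[int], float],
) -> Tuple[List[int], List[float]]:
    """Run UCB1 and return (selection history, final mean reward per arm)."""
    counts = [0] * num_arms    # times each virtual sequence was played
    means = [0.0] * num_arms   # running mean reward per virtual sequence
    history: List[int] = []

    for trial in range(num_trials):
        if trial < num_arms:
            # Initialization: play every virtual sequence once.
            arm = trial
        else:
            total = trial  # number of plays completed so far
            # ASSUMPTION: textbook UCB1 upper-confidence index.
            arm = max(
                range(num_arms),
                key=lambda i: means[i]
                + math.sqrt(2.0 * math.log(total) / counts[i]),
            )
        reward = play(arm)
        # Only the selected arm is updated; all others stay frozen.
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
        history.append(arm)

    return history, means
```

With fixed rewards such as `[0.1, 0.9, 0.2]`, the loop plays each arm once and then concentrates on the second arm while still revisiting the others occasionally, which is the exploitation and exploration balance the section describes.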
Coverage reward
The reward used by the MAB framework is based on coverage progress. For active coverage bins, a sequence receives credit for bins it triggers at least once during a trial; for example, if four bins remain active and the sequence hits three of them, the reward is 3/4. The reward is kept in the range [0,1], and the cited work renormalizes it with a logarithmic function to boost small rewards and compress large ones, which is useful near the end of simulation when hard-to-cover bins remain. [C7]
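A minimal sketch of this reward follows. The raw reward matches the 3/4 example above. The logarithmic renormalization is an assumption: the source does not give the function, so the sketch uses `log(1 + a*r) / log(1 + a)` with a hypothetical scale `a`, chosen only because it keeps the result in [0, 1] and lifts small rewards proportionally more than large ones.

```python
import math
from typing import Set


def raw_reward(active_bins: Set[str], hit_bins: Set[str]) -> float:
    """Fraction of still-active bins that were hit at least once in a trial."""
    if not active_bins:
        return 0.0  # nothing left to cover, so no credit is available
    return len(active_bins & hit_bins) / len(active_bins)


def log_renormalize(reward: float, a: float = 100.0) -> float:
    """Map a reward in [0, 1] to [0, 1] with a concave logarithmic curve.

    ASSUMPTION: both this functional form and the scale `a` are illustrative;
    the cited work's actual function is not reproduced here.
    """
    return math.log1p(a * reward) / math.log1p(a)
```

For example, `raw_reward({"b0", "b1", "b2", "b3"}, {"b0", "b1", "b2"})` gives 0.75, and under the assumed curve a small raw reward such as 0.05 is raised well above 0.05, so a sequence that hits one of many remaining hard bins still registers with the policy.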
Example: Instruction Fetch unit
In the IF-unit case study, the DUT subsystem is connected to the rest of the processor through four separate interfaces. During simulation, each interface is fed by a distinct test sequence that mimics the behavior of the corresponding connection when the processor executes real programs. The individual interface sequences are generated by constrained-random generators customized with interface-specific parameters. [C8]
To create virtual sequences for this case study, the authors selected parameter values for the constituent interface sequences and randomly selected K = 40 virtual sequences from the available choices. This number was chosen empirically to ensure that each possible parameter level was used at least once, while still allowing the verification engineer to choose specific parameter sets for targeted corner cases. [C9]
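The selection step can be sketched as random sampling followed by a check that every parameter level appears at least once. The parameter names and levels below are hypothetical, and the model simplifies each interface sequence to a single parameter; only K = 40 and the "every level used at least once" criterion come from the text.

```python
import random
from typing import Dict, List

# ASSUMPTION: illustrative parameters, one per IF-unit interface.
PARAM_LEVELS: Dict[str, List[int]] = {
    "if0_delay": [0, 1, 2, 4],
    "if1_burst": [1, 2, 8],
    "if2_stall_pct": [0, 25, 50, 75],
    "if3_flush_pct": [0, 5, 20],
}


def sample_virtual_sequences(k: int, rng: random.Random) -> List[Dict[str, int]]:
    """Draw k virtual sequences, each fixing one level per parameter."""
    return [
        {name: rng.choice(levels) for name, levels in PARAM_LEVELS.items()}
        for _ in range(k)
    ]


def all_levels_used(seqs: List[Dict[str, int]]) -> bool:
    """True if every level of every parameter appears in at least one sequence."""
    return all(
        {seq[name] for seq in seqs} == set(levels)
        for name, levels in PARAM_LEVELS.items()
    )


def select_virtual_sequences(
    k: int = 40, seed: int = 0, max_attempts: int = 1000
) -> List[Dict[str, int]]:
    """Resample until the chosen set exercises every parameter level."""
    rng = random.Random(seed)
    for _ in range(max_attempts):
        seqs = sample_virtual_sequences(k, rng)
        if all_levels_used(seqs):
            return seqs
    raise RuntimeError("no sample covered every parameter level")
```

A verification engineer could append hand-picked parameter sets for corner cases to the returned list; once simulation starts, the set stays fixed, as the framework requires.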
Reported effect in the cited MAB experiment
Using the same 40 virtual sequences, the cited experiment compared random application of sequences for 5000 trials against the MAB framework using UCB1 and the proposed reward function. Across five random seeds, random selection achieved 82.9% to 90.6% coverage after 5000 trials, while the MAB framework reached the same coverage goals in 1507 to 2590 trials, an average of 1988 trials and a reported saving of 60%. The authors summarize this as about a 2× reduction in the number of trials needed to reach the same coverage goal. [C10]
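As a quick arithmetic check of these figures, assuming the saving is measured against the full 5000-trial random baseline:

```latex
\begin{align*}
\text{average saving} &= 1 - \frac{1988}{5000} \approx 0.602 \quad (\text{about } 60\%) \\
\text{best seed}      &= 1 - \frac{1507}{5000} \approx 0.699 \\
\text{worst seed}     &= 1 - \frac{2590}{5000} = 0.482 \\
\text{average ratio}  &= \frac{5000}{1988} \approx 2.5, \qquad
\text{worst-seed ratio} = \frac{5000}{2590} \approx 1.9
\end{align*}
```

Under that assumption, the average matches the reported 60% saving, and even the slowest seed needs roughly half the baseline trials, in line with the "about 2×" summary.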
The same work emphasizes that UCB1 does not improve the intrinsic quality of the selected sequence set; rather, it identifies the sequence set's potential more quickly by playing consistently higher-reward sequences more often while still exploring all sequences. [C11]