Overview
The evidence describes discrete-event simulation as the mechanism that the UVM base class library requires to synchronize testbench tasks and processes. In the eUVM implementation, the discrete-event simulator schedules testbench tasks and events and is broadly partitioned into a Scheduler and a Task Executor. [C1]
The scheduler manages event queues and schedules triggered tasks for execution. In a single-threaded simulator, the task executor runs runnable tasks sequentially on one CPU thread of the host machine. [C2]
Event scheduling model
The eUVM scheduler model shown in the evidence consists of the following steps: processing signal values and immediate events, checking for delta events, pushing triggered tasks to a runnable process queue, incrementing simulation time, and checking for timed events. The simulation ends when no timed events remain. [C3]
This model separates the decision of what is ready to run from the execution of runnable work. The scheduler is responsible for event-queue management, while the task executor performs task execution. [C2]
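The separation described above can be illustrated with a minimal sketch. It is not the eUVM implementation: the class `Scheduler`, its methods `schedule_delta` and `schedule_at`, and the example tasks are hypothetical names chosen for illustration, and signal-value processing is omitted. The outer loop plays the scheduler role (delta events, then timed events), while the innermost loop plays the task executor role.

```python
import heapq
from collections import deque


class Scheduler:
    """Toy discrete-event scheduler (hypothetical, for illustration only)."""

    def __init__(self):
        self.now = 0             # current simulation time
        self.delta = deque()     # tasks triggered for the next delta cycle
        self.runnable = deque()  # runnable process queue
        self.timed = []          # min-heap of (time, sequence number, task)
        self._seq = 0            # tie-breaker so tasks are never compared
        self.log = []

    def schedule_delta(self, task):
        self.delta.append(task)

    def schedule_at(self, delay, task):
        heapq.heappush(self.timed, (self.now + delay, self._seq, task))
        self._seq += 1

    def run(self):
        while True:
            # Scheduler role: drain delta events at the current time.
            while self.delta:
                self.runnable.extend(self.delta)
                self.delta.clear()
                # Task executor role: run runnable tasks sequentially.
                while self.runnable:
                    self.runnable.popleft()(self)
            # End the simulation when no timed events remain.
            if not self.timed:
                return
            # Advance simulation time to the earliest timed event.
            self.now = self.timed[0][0]
            while self.timed and self.timed[0][0] == self.now:
                _, _, task = heapq.heappop(self.timed)
                self.delta.append(task)


def tick(s):
    s.log.append(("tick", s.now))
    if s.now < 20:
        s.schedule_at(10, tick)


def react(s):
    s.log.append(("react", s.now))


def start(s):
    s.log.append(("start", s.now))
    s.schedule_delta(react)   # runs in the next delta cycle, same time
    s.schedule_at(10, tick)   # runs after simulation time advances


sched = Scheduler()
sched.schedule_delta(start)
sched.run()
print(sched.log)
```

In this sketch `react` runs at time 0 in a later delta cycle than `start`, and `tick` runs only after simulation time has advanced.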
Tasks and cooperative threading
In the UVM testbench context described by the evidence, simulator tasks include run phases of UVM components, body methods of sequences, and spawned forks. Most simulators implement these tasks as user threads, also described as cooperative threading. [C4]
When a task encounters a blocking function such as start_item or finish_item, simulation control can move to another testbench component, such as a driver. The sequence body task then yields, sleeps, and is later awakened after the driver processes the sequence item and sends a notification through item_done. [C5]
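The handshake described above can be sketched with Python generators, which are a form of cooperative user thread. This is an analogy under stated assumptions, not UVM or eUVM code: `sequence_body`, `driver_run`, `run_all`, and the shared `chan` dictionary are hypothetical, and a blocked task here polls in round-robin order rather than sleeping until a notification wakes it.

```python
from collections import deque


def sequence_body(items, chan, trace):
    """Stand-in for a sequence body: hands items to the driver one at a time."""
    for item in items:
        chan["req"] = item
        trace.append(f"seq: request {item}")
        # Blocked until the driver reports completion (item_done-like signal).
        while chan["done"] != item:
            yield  # relinquish control so another task can run
        trace.append(f"seq: {item} done")


def driver_run(chan, trace, count):
    """Stand-in for a driver run phase: consumes requests and signals done."""
    handled = 0
    while handled < count:
        while chan["req"] is None:
            yield  # nothing to drive yet
        item = chan["req"]
        chan["req"] = None
        trace.append(f"drv: drive {item}")
        chan["done"] = item
        handled += 1
        yield  # let the sequence observe the completion


def run_all(tasks):
    """Single-threaded executor: resumes each task in turn until all finish."""
    queue = deque(tasks)
    switches = 0
    while queue:
        task = queue.popleft()
        try:
            next(task)
            queue.append(task)
            switches += 1  # the task yielded, so a context switch follows
        except StopIteration:
            pass
    return switches


chan = {"req": None, "done": None}
trace = []
switches = run_all([
    sequence_body(["a", "b"], chan, trace),
    driver_run(chan, trace, 2),
])
print(trace, switches)
```

Each `yield` marks a point where the task gives up the single CPU thread, which is why the number of blocking points in a testbench translates directly into context switches.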
Context switching cost
Context switching is identified as an essential part of cooperative threading. When a task yields while waiting for an event, it relinquishes control of the CPU thread so another scheduled task can execute. Before yielding, the task saves state such as the call stack and CPU registers in host memory so it can later resume from where it stopped. [C6]
The evidence also characterizes context switching as simulator runtime overhead that contributes no useful testbench functionality. It states that performance optimization should reduce the number of simulation events so that context switches occur less often. [C7]
Parallel and multicore execution
The evidence describes a multicore testbench simulator architecture in which a parallel simulator implements a stack of task executors, and each task executor receives its own CPU thread to execute its share of tasks. Synchronization barriers are required to keep task executors synchronized with the scheduler. [C8]
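A minimal sketch of this arrangement follows, assuming one Python thread per task executor and a shared `threading.Barrier` that also includes the scheduler. The function `run_parallel` and its parameters are hypothetical. Because of CPython's global interpreter lock, the sketch shows only the synchronization structure, not a real speedup for compute-bound tasks.

```python
import threading


def run_parallel(task_groups, num_steps):
    """Run each task group on its own thread, in lockstep with a scheduler."""
    n = len(task_groups)
    barrier = threading.Barrier(n + 1)  # n executors plus the scheduler
    results = [[] for _ in range(n)]    # one private result list per executor

    def executor(idx):
        for step in range(num_steps):
            barrier.wait()  # wait until the scheduler releases this step
            for task in task_groups[idx]:
                results[idx].append(task(step))
            barrier.wait()  # report that this executor's share is done

    threads = [threading.Thread(target=executor, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()

    for step in range(num_steps):
        # Sequential scheduler work for this step would happen here.
        barrier.wait()  # release all executors
        barrier.wait()  # wait for all executors to finish the step

    for t in threads:
        t.join()
    return results


task_groups = [
    [lambda s: ("a", s)],
    [lambda s: ("b", s), lambda s: ("c", s)],
]
results = run_parallel(task_groups, 2)
print(results)
```

Every step costs two barrier waits per executor regardless of how little work the tasks do, which is one way to see why the barriers themselves become overhead.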
However, the evidence notes limits to scheduler parallelization. At a given simulation time, a testbench simulator may deal with only a small number of active events and processes, so parallelizing the scheduler may provide limited benefit. The scheduler is described as a sequential component, and synchronization barriers add overhead in multicore testbenches. [C9]
The evidence applies Amdahl’s Law to this performance discussion: the overall speedup from optimizing one part of a system is limited by the fraction of time that part is used. [C10]
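In its standard form, Amdahl's Law gives the overall speedup as 1 / ((1 - p) + p / s), where p is the fraction of runtime spent in the optimized part and s is the speedup of that part. The sketch below evaluates the formula; the function name `amdahl_speedup` and the numeric inputs are illustrative assumptions, not measurements from the evidence.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of runtime is sped up by factor s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


# Illustrative values only: half of the runtime parallelized over 4 threads.
half_parallel = amdahl_speedup(0.5, 4)

# If only 10% of the runtime benefits, even 8 threads help very little.
small_fraction = amdahl_speedup(0.1, 8)

print(half_parallel, small_fraction)
```

However large s becomes, the overall speedup stays below 1 / (1 - p), so a part that accounts for a small share of the runtime yields a small gain.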
When parallelism may help
The evidence indicates that multicore execution can still help when tasks are comparatively compute-intensive. A cited testbench scenario involves multiple UVM agents or Verification IPs, where sequence randomization can be compute-intensive because it involves solving complex constraints. eUVM distributes sequence randomization across multiple threads by mapping each UVM agent to a separate CPU thread. [C11]
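The idea of mapping each agent to its own thread can be sketched as follows. This is a hypothetical illustration, not eUVM's API: `randomize_items`, `randomize_all`, the agent names, and the toy constraints (a 4-aligned address and a burst that stays within 256 bytes) are all assumptions, and rejection sampling stands in for a real constraint solver. In CPython the global interpreter lock prevents these threads from running compute-bound work truly in parallel, so the sketch shows the mapping only.

```python
import random
import threading


def randomize_items(agent_id, seed, count):
    """Generate constrained items for one agent by rejection sampling."""
    rng = random.Random(seed)  # private generator, so agents do not interfere
    items = []
    while len(items) < count:
        addr = rng.randrange(0, 256)
        length = rng.randrange(1, 17)
        # Toy constraints: aligned address, burst stays inside the range.
        if addr % 4 == 0 and addr + length <= 256:
            items.append((agent_id, addr, length))
    return items


def randomize_all(agent_seeds, count):
    """Run randomization for each agent on its own thread."""
    results = {}

    def worker(agent_id, seed):
        results[agent_id] = randomize_items(agent_id, seed, count)

    threads = [
        threading.Thread(target=worker, args=(agent_id, seed))
        for agent_id, seed in agent_seeds.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


agent_seeds = {"agent0": 1, "agent1": 2, "agent2": 3}
first = randomize_all(agent_seeds, 5)
second = randomize_all(agent_seeds, 5)
print(first)
```

Giving each agent a private seeded generator keeps the results reproducible regardless of how the threads are interleaved.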