GELU activation function
ConceptThe Gaussian Error Linear Unit (GELU) is a non-linear activation function defined as GELU(h) = h · Φ(h), where Φ is the standard Gaussian cumulative distribution function. In the DeepVerifier architecture, GELU is applied inside the two fully-connected layers of the transformer feed-forward sub-network.
First seen 6/12/2026
Last seen 6/12/2026
Evidence 1 chunks
Wiki v1
WIKI
GELU activation function
The Gaussian Error Linear Unit (GELU) is a non-linear activation function used inside transformer-style models. In the DeepVerifier architecture, GELU is the non-linearity inserted between the two linear projections of each transformer's feed-forward sub-network.
Definition
NEIGHBORHOOD
No graph connections found for this entity yet. It may appear in future ingestion runs.
explore full graph →RELATIONSHIPS
1 connectionsDeepVerifier uses the GELU activation function in its transformer feed-forward network layers.
LINKED ENTITIES
1 linksCITATIONS
4 sources4 citations — click to collapse
[1] GELU is defined as GELU(h) = h · ½ (1 + Φ(h/√2)) ≈ 0.5h(1 + tanh(√(2/π)·(h + 0.044715h³))), where Φ is the Gaussian CDF (Equation 5). DeepVerifier: Learning to Update Test Sequences for Coverage-Guided Verification
[2] GELU is applied as the non-linearity in the two-layer feed-forward network fc1 = GELU(X·W1 + b1), fc2 = GELU(fc1·W2 + b2), with feed-forward dimension 1024 (Equation 6). DeepVerifier: Learning to Update Test Sequences for Coverage-Guided Verification
[3] The DeepVerifier transformer has encoder and decoder components, each consisting of two transformer blocks, and the decoder uses masked self-attention to prevent attending to future tokens. DeepVerifier: Learning to Update Test Sequences for Coverage-Guided Verification
[4] DeepVerifier trains the GELU-based transformer for a sequence generation objective over instruction sequences and, after fine-tuning, for a supervised coverage-score prediction task. DeepVerifier: Learning to Update Test Sequences for Coverage-Guided Verification