STIMSMITH

Address Translation

Concept

Address Translation is a hardware mechanism in computer architectures that maps addresses from one address space to another. Examples include virtual-to-physical mapping for processor memory accesses and the mapping of Network Physical Addresses (NPAs) to System Physical Addresses (SPAs) in distributed GPU systems. It is implemented using structures such as Translation Lookaside Buffers (TLBs) and Memory Management Units (MMUs), and remains a critical performance factor for both CPU workloads and large-scale distributed machine learning.

First seen 6/13/2026
Last seen 6/13/2026
Evidence 2 chunks
Wiki v1

WIKI

Overview

Address Translation is one of the main hardware mechanisms described in a processor's architecture specification, alongside interrupt handling and multi-tasking [1]. In modern computer architectures it is treated as a complex functional unit whose behavior must be verified together with the rest of the processor design [1].

The purpose of address translation is to convert addresses produced by software (or by a remote peer) into the addresses actually used by the underlying memory or interconnect hardware. Two principal variants appear in contemporary systems: conventional data address translation used by CPUs [2], and reverse address translation, which maps NPAs to SPAs at destination nodes that receive remote memory accesses over scale-up fabrics such as NVLink or UALink [4].
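The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the design of any cited system: the `Translator` class, the 4 KiB page size, the single-level page table, and the LRU replacement policy are all assumptions made for the example. The same structure serves both variants, since only the meaning of the input and output address spaces changes.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed 4 KiB pages; real systems support several sizes


class PageFault(Exception):
    """Raised when the page table holds no mapping for an address."""


class Translator:
    """Hypothetical translator: page number -> frame number, with a small TLB."""

    def __init__(self, page_table, tlb_entries=4):
        self.page_table = dict(page_table)  # backing table (single level, assumed)
        self.tlb = OrderedDict()            # LRU cache of recently used mappings
        self.tlb_entries = tlb_entries
        self.hits = 0
        self.misses = 0

    def translate(self, addr):
        # Split the incoming address into a page number and an in-page offset.
        page, offset = divmod(addr, PAGE_SIZE)
        if page in self.tlb:
            self.hits += 1
            self.tlb.move_to_end(page)      # mark as most recently used
        else:
            self.misses += 1
            if page not in self.page_table:
                raise PageFault(hex(addr))
            self.tlb[page] = self.page_table[page]  # fill the TLB from the table
            if len(self.tlb) > self.tlb_entries:
                self.tlb.popitem(last=False)        # evict the least recently used
        # The offset is carried over unchanged; only the page number is remapped.
        return self.tlb[page] * PAGE_SIZE + offset


# Conventional direction: virtual address -> physical address.
cpu = Translator({0: 7, 1: 3})
physical = cpu.translate(0x1234)   # page 1, offset 0x234

# Reverse direction at a destination node: NPA -> SPA (same lookup, other spaces).
link = Translator({0x10: 0x2})
spa = link.translate(0x10008)      # page 0x10, offset 0x8
```

The first access to a page always misses in this sketch, which is the "cold miss" case; repeated accesses to the same page hit in the cache and skip the table lookup.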



CITATIONS

6 sources
[1] Address translation is one of the main hardware mechanisms described in a processor architecture specification, treated as a complex functional unit. Source: Test program generator, International Business Machines Corporation.
[2] DTLB misses are dominated by a small subset of static loads, and a dynamic instance of a static load frequently accesses the same PTE as the prior dynamic instance. Source: PC-Indexed Data Address Translation.
[3] PCAX can cut the effective DTLB miss rate by a factor of 2-3x, reduce STLB misses, and yield an average 1.7% performance improvement while reducing data address translation energy by 7% across 84 server traces. Source: PC-Indexed Data Address Translation.
[4] Reverse Address Translation maps Network Physical Addresses (NPAs) to System Physical Addresses (SPAs) on the destination side of scale-up fabric accesses such as NVLink or UALink. Source: Analyzing Reverse Address Translation Overheads in Multi-GPU Scale-Up Pods.
[5] Cold TLB misses in Link TLBs can cause up to 1.4x performance degradation for small, latency-sensitive collectives, while larger collectives show diminishing returns from oversized TLBs. Source: Analyzing Reverse Address Translation Overheads in Multi-GPU Scale-Up Pods.
[6] Verification tasks for address translation include crossing page boundaries, triggering page faults, and exercising cache hit/miss scenarios on operand addresses. Source: Test program generator, International Business Machines Corporation.