
HBM (High Bandwidth Memory)

3D-stacked DRAM technology used in data-centre GPUs, offering roughly 5–10× the memory bandwidth of GDDR at the cost of higher manufacturing complexity and limited capacity per stack.

Definition

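High Bandwidth Memory (HBM) stacks multiple DRAM dies vertically, links them with through-silicon vias (TSVs), and places the stacks next to the GPU die on a silicon interposer. The interposer carries a very wide interface (1024 bits per stack up to HBM3), and that width, rather than a high per-pin clock, is where the bandwidth comes from. The result is dramatically higher bandwidth (3.35 TB/s on H100 SXM) than a typical GDDR6 configuration (≈600 GB/s), at the cost of higher manufacturing complexity and lower maximum capacity per stack. LLM decoding is heavily memory-bandwidth bound: each generated token requires reading essentially all model weights from memory, so at small batch sizes HBM bandwidth is the dominant determinant of decode throughput for a given model size.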
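The bandwidth-bound argument can be turned into a rough back-of-the-envelope estimate. The sketch below assumes batch size 1, that every weight is read from memory once per generated token, and that KV-cache traffic and compute time are ignored, so it gives an upper bound rather than a prediction. The function name and the 7B-parameter, 2-bytes-per-parameter model are illustrative assumptions; only the two bandwidth figures come from the definition above.

```python
def decode_tokens_per_second_upper_bound(bandwidth_bytes_per_s: float,
                                         n_params: float,
                                         bytes_per_param: float) -> float:
    """Upper bound on single-stream decode speed when memory-bandwidth bound.

    Assumes every weight is read once per token and nothing else
    (KV cache, activations, compute) limits throughput.
    """
    model_bytes = n_params * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes


# Bandwidth figures quoted in the definition above.
H100_SXM_HBM_BYTES_PER_S = 3.35e12   # 3.35 TB/s
GDDR6_BYTES_PER_S = 600e9            # ~600 GB/s

# Hypothetical model: 7 billion parameters stored in 16-bit precision.
N_PARAMS = 7e9
BYTES_PER_PARAM = 2

hbm_bound = decode_tokens_per_second_upper_bound(
    H100_SXM_HBM_BYTES_PER_S, N_PARAMS, BYTES_PER_PARAM)
gddr_bound = decode_tokens_per_second_upper_bound(
    GDDR6_BYTES_PER_S, N_PARAMS, BYTES_PER_PARAM)

print(f"HBM bound:   {hbm_bound:.0f} tokens/s")
print(f"GDDR6 bound: {gddr_bound:.0f} tokens/s")
print(f"Ratio:       {hbm_bound / gddr_bound:.2f}x")
```

Under these assumptions the bound depends only on bandwidth and model size in bytes, so the ratio between the two memory types equals the ratio of their bandwidths regardless of the model chosen.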
