
Arithmetic Intensity

The ratio of FLOPs to bytes of memory traffic for an operation, used to determine whether a workload is compute-bound or memory-bandwidth-bound.

Definition

Arithmetic intensity (measured in FLOPs/byte) characterises how computationally dense an operation is relative to the memory traffic it generates, counting both the bytes it reads and the bytes it writes. Operations with low arithmetic intensity (e.g., a batch-size-1 linear layer in LLM decode) move many bytes to and from HBM per useful FLOP and are memory-bandwidth-bound. Operations with high arithmetic intensity (e.g., prefill matmuls with a large batch) are compute-bound. The crossover depends on the hardware: an operation is compute-bound when its arithmetic intensity exceeds the ratio of the device's peak FLOP/s to its peak memory bandwidth. The roofline model plots attainable throughput as a function of arithmetic intensity, making it the standard tool for identifying inference bottlenecks.
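
As a rough illustration of the definition, the sketch below estimates the arithmetic intensity of a single matmul and applies the roofline bound. It assumes an idealised traffic model in which each operand and the output cross the memory bus exactly once, with 2-byte (fp16/bf16) elements. The function names and the hardware figures (1e15 FLOP/s peak compute, 2e12 bytes/s peak bandwidth) are hypothetical values chosen for illustration, not the specification of any real device.

```python
def matmul_arithmetic_intensity(m, k, n, bytes_per_element=2):
    """FLOPs per byte for C[m, n] = A[m, k] @ B[k, n].

    Assumes each of A, B and C moves to or from memory exactly once
    (no cache reuse and no re-reads), which is an idealisation.
    """
    flops = 2 * m * k * n  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops / bytes_moved


def roofline_attainable_flops(intensity, peak_flops, peak_bandwidth):
    """Roofline bound: min(compute roof, bandwidth roof * intensity)."""
    return min(peak_flops, intensity * peak_bandwidth)


def is_compute_bound(intensity, peak_flops, peak_bandwidth):
    """True when intensity is at or above the ridge point (peak FLOP/s per byte/s)."""
    return intensity >= peak_flops / peak_bandwidth


# Hypothetical accelerator, for illustration only.
PEAK_FLOPS = 1e15       # FLOP/s (assumed)
PEAK_BANDWIDTH = 2e12   # bytes/s (assumed); ridge point = 500 FLOPs/byte

# Decode-like case: one token through a 4096 x 4096 linear layer.
decode_ai = matmul_arithmetic_intensity(m=1, k=4096, n=4096)

# Prefill-like case: 4096 tokens through the same layer.
prefill_ai = matmul_arithmetic_intensity(m=4096, k=4096, n=4096)

for name, ai in [("decode", decode_ai), ("prefill", prefill_ai)]:
    bound = "compute" if is_compute_bound(ai, PEAK_FLOPS, PEAK_BANDWIDTH) else "memory-bandwidth"
    attainable = roofline_attainable_flops(ai, PEAK_FLOPS, PEAK_BANDWIDTH)
    print(f"{name}: {ai:.2f} FLOPs/byte, {bound}-bound, attainable {attainable:.3e} FLOP/s")
```

Under these assumptions the decode case comes out at roughly 1 FLOP/byte, far below the assumed ridge point of 500, while the prefill case lands well above it. Real kernels can differ in both directions, since caching, tiling, and fused operations all change the bytes actually moved.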
