🗄️ Ch 5 · beginner

KV Cache Sizing Calculator

Calculate KV cache memory for different models, sequence lengths, batch sizes, and precisions.

Model

Sequence length: 4K tokens
Batch size: 1

KV Cache Precision: FP16

Total KV Cache

1.34 GB

1 batch × 4,096 tokens

Per Token

320.0 KiB (327,680 bytes)

across 80 layers

Per Request

1,342.2 MB

4,096 tokens

Architecture Details

80 layers × 8 KV heads × 128 head dim

2 (K+V) × 8 KV heads × 128 head dim × 2 bytes (FP16) = 4,096 bytes/token/layer
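
The figures above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the calculator's actual implementation: the names `KVConfig`, `BYTES_PER_ELEMENT`, and `kv_cache_bytes` are hypothetical, and the byte widths listed for precisions other than FP16 are assumptions about which options the precision selector offers.

```python
from dataclasses import dataclass

# Bytes per stored element. Only FP16 appears on this page; the other
# entries are assumed for illustration.
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1, "int8": 1}


@dataclass(frozen=True)
class KVConfig:
    """Hypothetical container for the architecture details shown above."""
    num_layers: int
    num_kv_heads: int
    head_dim: int


def kv_bytes_per_token_per_layer(cfg: KVConfig, precision: str = "fp16") -> int:
    # 2 accounts for storing both a key and a value vector per KV head.
    return 2 * cfg.num_kv_heads * cfg.head_dim * BYTES_PER_ELEMENT[precision]


def kv_bytes_per_token(cfg: KVConfig, precision: str = "fp16") -> int:
    # Every layer keeps its own K and V entries for each token.
    return cfg.num_layers * kv_bytes_per_token_per_layer(cfg, precision)


def kv_cache_bytes(cfg: KVConfig, seq_len: int, batch_size: int = 1,
                   precision: str = "fp16") -> int:
    # The cache grows linearly with both sequence length and batch size.
    return batch_size * seq_len * kv_bytes_per_token(cfg, precision)


cfg = KVConfig(num_layers=80, num_kv_heads=8, head_dim=128)
total = kv_cache_bytes(cfg, seq_len=4096, batch_size=1)

print(kv_bytes_per_token_per_layer(cfg))  # 4096 bytes/token/layer
print(kv_bytes_per_token(cfg) / 1024)     # 320.0 (KiB per token)
print(round(total / 1e9, 2))              # 1.34 (decimal GB)
print(round(total / 2**30, 2))            # 1.25 (GiB)
```

Note that the per-token figure matches the page only in binary units (327,680 bytes = 320 KiB), while the total matches in decimal units (1,342,177,280 bytes ≈ 1.34 GB, or 1.25 GiB).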