Architecture

Beam Search

Deterministic decoding that keeps the B highest-probability partial sequences at each step, where B is the beam width. Common in machine translation but rarely used in modern LLM chat.

Definition

Beam search maintains B candidate sequences (beams), where B is the beam width. At each step it extends every beam with its possible next tokens, scores each extension by cumulative log-probability, and keeps only the B highest-scoring sequences. Because it tracks several hypotheses at once, it usually finds a higher-probability sequence than greedy decoding (the special case B = 1), though it does not guarantee the globally optimal one. The cost is that the model must score B sequences per step, typically as a single batched forward pass, so compute and memory grow roughly in proportion to B. Beam search is standard in machine translation and summarization, where outputs are short and well defined; for open-ended generation, sampling methods tend to produce more diverse and natural-sounding text.
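The procedure can be sketched in a few lines of Python. This is a minimal illustration, not a production decoder: the names beam_search, toy_next_log_probs, and the TOY probability table are hypothetical, the "model" is a hand-written lookup table rather than a neural network, and the sketch omits refinements such as length normalization. It also makes the simplifying assumption that a finished hypothesis occupies one of the B slots at the step where it ends.

```python
import math
from typing import Callable, Dict, List, Tuple

Seq = Tuple[str, ...]
Hypothesis = Tuple[Seq, float]  # (tokens, cumulative log-probability)


def beam_search(
    next_log_probs: Callable[[Seq], Dict[str, float]],
    bos: str,
    eos: str,
    beam_width: int,
    max_len: int,
) -> Hypothesis:
    """Return the best-scoring hypothesis found with the given beam width."""
    beams: List[Hypothesis] = [((bos,), 0.0)]
    finished: List[Hypothesis] = []

    for _ in range(max_len):
        # Expand every live beam with every possible next token.
        candidates: List[Hypothesis] = []
        for seq, score in beams:
            for token, log_p in next_log_probs(seq).items():
                candidates.append((seq + (token,), score + log_p))

        # Keep only the beam_width highest-scoring extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == eos:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break

    # Hypotheses still unfinished at max_len compete with finished ones.
    finished.extend(beams)
    return max(finished, key=lambda c: c[1])


# Hypothetical toy distribution: the locally best first token ("a")
# leads to a worse full sequence than the second-best one ("b").
TOY: Dict[Seq, Dict[str, float]] = {
    ("<s>",): {"a": 0.6, "b": 0.4},
    ("<s>", "a"): {"x": 0.5, "y": 0.5},
    ("<s>", "b"): {"z": 0.9, "w": 0.1},
}


def toy_next_log_probs(seq: Seq) -> Dict[str, float]:
    # Any prefix not in the table ends the sequence with probability 1.
    probs = TOY.get(seq, {"</s>": 1.0})
    return {token: math.log(p) for token, p in probs.items()}


greedy = beam_search(toy_next_log_probs, "<s>", "</s>", beam_width=1, max_len=5)
beam2 = beam_search(toy_next_log_probs, "<s>", "</s>", beam_width=2, max_len=5)
print(greedy[0], math.exp(greedy[1]))
print(beam2[0], math.exp(beam2[1]))
```

With this toy table, a beam width of 1 (greedy decoding) commits to "a" and ends with a sequence of probability 0.30, while a beam width of 2 keeps "b" alive long enough to find the sequence "b z" with probability 0.36. The inner loop also shows where the cost comes from: every step scores the continuations of all B live beams.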
