Benchmark Comparison

Data as of 2025-12-05

Compare vLLM and SGLang performance on DeepSeek-R1-Distill-Llama-8B across workloads and concurrency levels.

Source: vllm-vs-sglang-performance-benchmark (2x H100 SXM, tensor parallelism TP=2, CUDA 12.9, 5,980 total requests)

18 results
Tokens / Second