Browse community-verified serving configurations.
RTX 4090 · vLLM · fp16: throughput 3.8k tok/s, latency 12 ms
A100 · vLLM · fp8: throughput 6.2k tok/s, latency 8 ms
H100 · TGI · fp8: throughput 7.1k tok/s, latency 6 ms
GB10 · vLLM · nvfp4: throughput 8.4k tok/s, latency 5 ms