ServingCard
Hardware Configuration
GPU Model
NVIDIA RTX 4090 (24GB)
NVIDIA A100 (80GB)
NVIDIA H100 (80GB)
NVIDIA RTX 3090 (24GB)
NVIDIA GB10 (128GB)
Quantization
FP16 (Half Precision)
FP8
NVFP4
AWQ (4-bit)
GPTQ (4-bit)
INT8 (8-bit)
Batch Size
32
64
128
256
512