NVIDIA / Nemotron 3 Super

Quantizations

Quant | Quantized by | Size | Decode | Prefill | Score
- | Moring Labs | 50.9 GB | 43.3 tok/s | 322.1 tok/s | Runs ok

Device Comparison

Results include only trials with 4,096 input tokens and 1,024 output tokens.
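As a rough illustration of what these figures imply, the sketch below estimates how long one such trial would take at the listed speeds (322.1 tok/s prefill, 43.3 tok/s decode). It assumes the reported speeds are averages that hold constant across the whole trial, which the page does not state; the function name `estimate_latency` is hypothetical and not part of any benchmark tool.

```python
def estimate_latency(input_tokens, output_tokens, prefill_tps, decode_tps):
    """Return (prefill_s, decode_s, total_s) in seconds.

    Assumption: throughput is constant over the trial, so time is simply
    token count divided by tokens per second.
    """
    if prefill_tps <= 0 or decode_tps <= 0:
        raise ValueError("throughput values must be positive")
    prefill_s = input_tokens / prefill_tps   # time to process the prompt
    decode_s = output_tokens / decode_tps    # time to generate the output
    return prefill_s, decode_s, prefill_s + decode_s


# Trial shape and speeds taken from the listing above.
prefill_s, decode_s, total_s = estimate_latency(4096, 1024, 322.1, 43.3)
print(f"prefill ~{prefill_s:.1f} s, decode ~{decode_s:.1f} s, total ~{total_s:.1f} s")
```

Under that assumption, the prompt takes about 12.7 s to process and the output about 23.6 s to generate, for roughly 36.4 s per trial. Real runs may differ, since decode speed typically varies with context length.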

Decode / Prefill Speeds

1 device