Meta / Llama 3.2 3B Instruct

Quantizations

Quantized by     Size     Decode        Prefill        Score
MLX Community    1.7 GB   68.4 tok/s    652.0 tok/s    Runs well

Device Comparison

Results include only trials with 4,096 input tokens and 1,024 output tokens.
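To put the throughput figures in context, here is a rough sketch of how prefill and decode speeds translate into wall-clock time for a trial of this shape. It assumes the 68.4 tok/s decode and 652.0 tok/s prefill figures listed under Quantizations apply to a 4,096-token input and 1,024-token output, and that total time is simply prefill time plus decode time. The function name `estimate_latency_s` is hypothetical, and real runs also include model loading and other overhead not modeled here.

```python
def estimate_latency_s(input_tokens, output_tokens, prefill_tok_s, decode_tok_s):
    """Rough estimate: time to process the prompt plus time to generate the output."""
    prefill_s = input_tokens / prefill_tok_s   # prompt tokens are processed at the prefill rate
    decode_s = output_tokens / decode_tok_s    # output tokens are generated at the decode rate
    return prefill_s, decode_s, prefill_s + decode_s


# Assumption: the listed speeds are applied to the 4,096-in / 1,024-out trial shape.
prefill_s, decode_s, total_s = estimate_latency_s(4096, 1024, 652.0, 68.4)
print(f"prefill {prefill_s:.1f} s, decode {decode_s:.1f} s, total {total_s:.1f} s")
```

Under these assumptions, prefill takes roughly 6 seconds and decode roughly 15 seconds, so generation dominates the total even though the prompt is four times longer than the output.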

Decode / Prefill Speeds

4 devices