Meta / Llama 3.2 1B Instruct

Quantizations

Quant  Quantized by   Size      Decode       Prefill        Score
-      MLX Community  663.1 MB  130.0 tok/s  1,611.2 tok/s  Runs great

Device Comparison

Results include only trials with 4,096 input tokens and 1,024 output tokens.
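As a rough illustration of what these figures imply, the sketch below estimates the wall-clock time of one such trial from the prefill and decode speeds listed in the table above. It assumes the listed speeds are steady averages and ignores model load time and any other overhead; the function name `estimate_trial_seconds` is hypothetical and not part of any benchmark tool.

```python
def estimate_trial_seconds(input_tokens, output_tokens, prefill_tps, decode_tps):
    """Estimate prefill, decode, and total time in seconds.

    Assumption: throughput is constant over the whole trial, so
    time = tokens / (tokens per second) for each phase.
    """
    prefill_s = input_tokens / prefill_tps   # time to process the prompt
    decode_s = output_tokens / decode_tps    # time to generate the output
    return prefill_s, decode_s, prefill_s + decode_s


# Trial shape and speeds taken from the listing above.
prefill_s, decode_s, total_s = estimate_trial_seconds(
    input_tokens=4096,
    output_tokens=1024,
    prefill_tps=1611.2,
    decode_tps=130.0,
)
print(f"prefill ~{prefill_s:.2f} s, decode ~{decode_s:.2f} s, total ~{total_s:.2f} s")
```

Under these assumptions, decode accounts for most of the estimated time even though the trial has four times as many input tokens as output tokens.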

Decode / Prefill Speeds

1 device