Google / Gemma 4 31B

Quantizations

Quantized by     Size      Decode       Prefill      Score
MLX Community    16.3 GB   10.2 tok/s   70.5 tok/s   Barely runs
MLX Community    31.4 GB   N/A          N/A          N/A

Device Comparison

Results include only trials with 4,096 input tokens and 1,024 output tokens.
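To give a rough sense of what this trial size means in wall-clock time, the sketch below converts throughput into seconds. It assumes constant throughput during each phase and, purely for illustration, applies the 70.5 tok/s prefill and 10.2 tok/s decode figures listed for the 16.3 GB quantization; the page does not state that those figures come from a trial of this size. The function name `estimate_seconds` is hypothetical.

```python
def estimate_seconds(input_tokens, output_tokens, prefill_tps, decode_tps):
    """Return (prefill_s, decode_s, total_s), assuming constant throughput per phase."""
    prefill_s = input_tokens / prefill_tps   # time to process the prompt
    decode_s = output_tokens / decode_tps    # time to generate the output
    return prefill_s, decode_s, prefill_s + decode_s


# Trial size from the page: 4,096 input tokens and 1,024 output tokens.
# Throughputs are the 16.3 GB row's figures, used here as an assumption.
prefill_s, decode_s, total_s = estimate_seconds(4096, 1024, 70.5, 10.2)
print(f"prefill {prefill_s:.1f} s, decode {decode_s:.1f} s, total {total_s:.1f} s")
```

Under these assumptions, decode takes longer than prefill even though it handles a quarter as many tokens, because the decode rate is about seven times lower.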

Decode / Prefill Speeds

4 devices