| mlx_lm | 213.9 tok/s | 2,913.2 tok/s | 1,443 ms | 2.20 GB | 202 runs | |
| mlx_lm | 168.6 tok/s | 2,202.7 tok/s | 1,869 ms | 2.56 GB | 727 runs | |
| mlx_lm | 151.6 tok/s | 1,895.0 tok/s | 2,282 ms | 2.13 GB | 301 runs | |
| mlx_lm | 130.0 tok/s | 1,611.2 tok/s | 2,603 ms | 2.13 GB | 301 runs | |
| mlx_lm | 113.5 tok/s | 1,053.3 tok/s | 3,662 ms | 2.58 GB | 301 runs | |
| llama.cpp | 108.9 tok/s | 2,757.2 tok/s | 1,600 ms | 1.26 GB | 202 runs | None |
| mlx_lm | 96.7 tok/s | 1,095.4 tok/s | 3,704 ms | 3.14 GB | 301 runs | |
| llama.cpp | 91.7 tok/s | 3,180.2 tok/s | 1,288 ms | 0.68 GB | 11 runs | |
| llama.cpp | 74.7 tok/s | 2,288.3 tok/s | 1,970 ms | 0.89 GB | 101 runs | None |
| llama.cpp | 72.1 tok/s | 2,228.2 tok/s | 1,980 ms | 0.83 GB | 101 runs | None |
| mlx_lm | 68.4 tok/s | 652.0 tok/s | 6.06 sec | 3.43 GB | 301 runs | |
| mlx_lm | 59.7 tok/s | 401.8 tok/s | 10.07 sec | 4.64 GB | 301 runs | |
| mlx_lm | 55.9 tok/s | 380.4 tok/s | 10.70 sec | 21.00 GB | 312 runs | |
| mlx_lm | 44.5 tok/s | 352.2 tok/s | 12.01 sec | 13.00 GB | 301 runs | |
| mlx_lm | 39.9 tok/s | 290.2 tok/s | 14.29 sec | 4.08 GB | 301 runs | |
| mlx_lm | 30.6 tok/s | 198.6 tok/s | 20.75 sec | 7.08 GB | 301 runs | |
| qwen3-4b-instruct-2507-q8_0.gguf | llama.cpp | 30.4 tok/s | 495.6 tok/s | 8.85 sec | 5.02 GB | 301 runs | None |
| llama.cpp | 24.4 tok/s | 386.4 tok/s | 10.67 sec | 3.35 GB | 101 runs | None |
| mlx_lm | 7.5 tok/s | 38.7 tok/s | 106.08 sec | 18.00 GB | 301 runs | |