Docs

whatcani.run is a platform for finding the best models to run locally, and how to run them, based on real benchmark data. The goal is to simplify picking and running a local model and to help answer questions like:

  • What model should I use for <task>? What quantization/format/runtime?
  • How should I configure the model?
  • What performance will I get?
  • What device should I buy/use?
  • How do devices compare against each other on <metric>?

How it Works

  1. People run and submit benchmarks via the CLI.
  2. Results are sanity-checked and approved/rejected.
  3. Stats are aggregated across models/devices and displayed here.

Submitting Benchmarks

Benchmarks are submitted via the CLI:

Install whatcanirun CLI
bun install -g whatcanirun # or `bunx whatcanirun`

To submit a benchmark, run whatcanirun on its own for interactive mode, or pass the options directly:

Submit a benchmark
whatcanirun run --model $MODEL_PATH_OR_HF_REFERENCE --runtime $RUNTIME --submit

If the model is not found on your device, it is downloaded automatically from Hugging Face. llama.cpp and MLX are currently the only supported runtimes. See the README.md for more details.
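If you want to benchmark several models with the same runtime, a small wrapper can generate one submit command per model. The following is a minimal sketch, not part of the CLI: the helper names build_submit_cmd and print_submit_cmds are hypothetical, and it uses only the flags shown above. It prints the commands for review instead of running them, and assumes model references and runtime names contain no whitespace.

```shell
# Sketch only: prints whatcanirun commands for review; nothing is executed.
# Helper names are hypothetical and not part of the whatcanirun CLI.

# Build one submit command.
#   $1 = model path or Hugging Face reference
#   $2 = runtime name (check the README.md for the exact identifiers)
build_submit_cmd() {
  printf 'whatcanirun run --model %s --runtime %s --submit\n' "$1" "$2"
}

# Print one submit command per model, all using the same runtime.
#   $1 = runtime name, remaining arguments = models
print_submit_cmds() {
  runtime="$1"
  shift
  for model in "$@"; do
    build_submit_cmd "$model" "$runtime"
  done
}
```

For example, print_submit_cmds "$RUNTIME" "$MODEL_A" "$MODEL_B" prints two commands, where MODEL_A and MODEL_B are placeholders for your own model paths or Hugging Face references. Once the output looks right, you can run the commands one at a time.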

You can optionally link your runs to an account by logging in on the site, then running whatcanirun auth login.

Contributing

The project is open source, and contributions are welcome. See the README.md for local development setup.

Contact

If you have any questions, ideas, issues, or requests, DM @fiveoutofnine.