Community submitted · Moderated request market · All requests reviewed before public render

Request a benchmark

Tell us what you wish were measured. Specific (model × hardware × runtime) requests go through editorial review; accepted ones land on the public benchmark roadmap and become candidates for any operator who wants to claim them.

Read the request policy for what we accept and what we don't.

How requests work

1. Submit. Pick a model + hardware (both required); optionally specify the runtime, quant, context, OS, and use case. Tell us why you want this measurement (1-2 sentences).

2. Review. Editorial reviews within 1-7 days. We accept if the request is specific + plausible + not a duplicate. Rejected / duplicate requests stay private to editorial.

3. Claim. Accepted requests show up on /benchmarks/wanted with an “I can measure this” CTA. An operator who claims it signals intent; claiming creates no public credit by itself.

4. Measure. When a measurement lands and editorial confirms it matches the request, the request status moves to “measured” and links to the published benchmark.
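The four steps above can be read as a small status lifecycle. The sketch below is illustrative only: apart from "measured", the status names are hypothetical, and the direct move from accepted to measured (a measurement landing without a prior claim) is an assumption rather than documented behavior.

```python
from enum import Enum


class Status(Enum):
    # Only "measured" is named on this page; the other names are hypothetical.
    SUBMITTED = "submitted"
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    DUPLICATE = "duplicate"
    CLAIMED = "claimed"
    MEASURED = "measured"


# Allowed moves between statuses, inferred from the four steps (assumption).
TRANSITIONS = {
    Status.SUBMITTED: {Status.ACCEPTED, Status.REJECTED, Status.DUPLICATE},
    Status.ACCEPTED: {Status.CLAIMED, Status.MEASURED},
    Status.CLAIMED: {Status.MEASURED},
    Status.REJECTED: set(),
    Status.DUPLICATE: set(),
    Status.MEASURED: set(),
}

# Rejected and duplicate requests stay private to editorial.
PUBLIC = {Status.ACCEPTED, Status.CLAIMED, Status.MEASURED}


def advance(current, target):
    """Return the new status, or raise ValueError if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def is_public(status):
    """True if a request in this status would be shown publicly."""
    return status in PUBLIC
```

For example, `advance(Status.SUBMITTED, Status.ACCEPTED)` succeeds, while trying to move a rejected request anywhere raises `ValueError`.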

What makes a good request

Specific + well-motivated requests get accepted faster.

Name the exact model + hardware. “Llama 3.1 70B on RTX 4090” is fine; “a coding model on a fast GPU” is not.
Say which runtime if you care. vLLM, llama.cpp, and MLX produce meaningfully different numbers; naming one helps the operator who claims the request.
Explain why. “I'm deciding between a 4090 and a 5090 and need to know if 70B Q4 fits comfortably” tells us what page this would unlock.
Don't request what already exists. Browse /benchmarks first — many setups are already covered.

Model *

Hardware *

Runtime

Quant format

Context (tokens)

Use case

Why do you want this measurement? *

Markdown. 20-5000 characters. Sanitized at render time.

Your name (optional)

URL (optional)

Email (optional)
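If you are preparing a request programmatically, the form rules above (three required fields marked with *, and a 20-5000 character motivation) could be checked up front with something like the following. This is a rough sketch: the field keys (`model`, `hardware`, `why`) are hypothetical, and counting characters after trimming whitespace is an assumption, so the site's own validation may differ.

```python
# Field keys are hypothetical; the form marks Model, Hardware, and the
# "why" text as required (*).
REQUIRED = ("model", "hardware", "why")
WHY_MIN, WHY_MAX = 20, 5000


def validate_request(form):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for field in REQUIRED:
        if not str(form.get(field) or "").strip():
            errors.append(f"{field} is required")

    # Assumption: length is measured on the trimmed text.
    why = str(form.get("why") or "").strip()
    if why and not (WHY_MIN <= len(why) <= WHY_MAX):
        errors.append(f"why must be {WHY_MIN}-{WHY_MAX} characters")
    return errors


request = {
    "model": "Llama 3.1 70B",
    "hardware": "RTX 4090",
    "runtime": "llama.cpp",  # optional
    "why": "Deciding between a 4090 and a 5090; need to know if 70B Q4 fits.",
}
problems = validate_request(request)
```

Optional fields such as runtime, quant format, and context are passed through unchecked here.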

Privacy

Email is optional. If you provide it, we'll use it only for moderator follow-up and to notify you when a measurement lands. Your email never appears publicly.

We hash your IP for rate limiting (max 3 requests per hour) and duplicate detection. Raw IPs are never stored. The hash includes a daily salt so it can't be used to track you across days.
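To make the salted-hash idea concrete, here is a minimal sketch of one way such a scheme could work. It is not the site's actual implementation: the choice of HMAC-SHA256, the in-memory salt and counters, the sliding one-hour window, and every function name below are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets
import time
from collections import defaultdict, deque
from datetime import date

MAX_REQUESTS = 3        # from the policy above: max 3 requests per hour
WINDOW_SECONDS = 3600

_salts = {}             # day -> random salt; only one day is kept at a time


def daily_salt(day):
    """Return the salt for `day`, discarding any salt from another day."""
    if day not in _salts:
        _salts.clear()  # once dropped, older hashes cannot be recomputed
        _salts[day] = secrets.token_bytes(32)
    return _salts[day]


def hash_ip(ip, day=None):
    """Keyed hash of the IP; the raw IP itself is never stored."""
    day = day or date.today()
    return hmac.new(daily_salt(day), ip.encode("utf-8"), hashlib.sha256).hexdigest()


_recent = defaultdict(deque)  # ip hash -> timestamps of recent requests


def allow_request(ip, now=None):
    """Sliding-window check: at most MAX_REQUESTS per WINDOW_SECONDS."""
    now = time.time() if now is None else now
    window = _recent[hash_ip(ip)]
    while window and now - window[0] >= WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_REQUESTS:
        return False
    window.append(now)
    return True
```

Because the salt changes each day, the same address hashes to a different value tomorrow, which is what prevents cross-day tracking. One side effect of this particular sketch is that the per-hour counter effectively resets when the salt rotates.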

Already have a measurement? Submit it directly via /submit/benchmark.