Benchmark results browser

Every benchmark in the corpus, filterable by source, scenario, and reproducibility. Newer rows show cold-start vs steady-state, P5/P95 CI, tokens-per-watt, and accuracy when those fields were captured; legacy rows are labeled "rigor pending".

Total: 39

Operator-measured: 39

Marked reproduced: 0

Source: All, Operator-measured, Community, Vendor-published

Scenario: All, Single-stream, 2 concurrent, 4 concurrent

Reproducibility: All, Marked reproduced

Hardware	Model	Quant	Tok/s	Rigor	Source	Date
No benchmarks match the current filters.

Showing up to 200 rows, newest first. See /resources/benchmark-protocol for what the rigor pills mean and how to reproduce any row.
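
The filtering and ordering described on this page can be sketched roughly as follows. This is a minimal illustration, not the site's actual implementation: the `BenchmarkRow` type, its field names, and the `filter_rows` function are all hypothetical, chosen to mirror the table columns and the three filters above. The only behavior taken from the page is the filter values, the 200-row cap, and the newest-first order.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class BenchmarkRow:
    """Hypothetical row shape; field names are assumptions, not the site's schema."""
    hardware: str
    model: str
    quant: str
    tok_per_s: float
    source: str        # "Operator-measured", "Community", or "Vendor-published"
    scenario: str      # "Single-stream", "2 concurrent", or "4 concurrent"
    reproduced: bool   # True when the row is marked reproduced
    measured_on: date


def filter_rows(
    rows: Iterable[BenchmarkRow],
    source: Optional[str] = None,       # None stands for the "All" option
    scenario: Optional[str] = None,     # None stands for the "All" option
    reproduced_only: bool = False,      # False stands for the "All" option
    limit: int = 200,                   # the page shows up to 200 rows
) -> List[BenchmarkRow]:
    """Apply the three filters, sort newest first, and cap the result."""
    matches = [
        r for r in rows
        if (source is None or r.source == source)
        and (scenario is None or r.scenario == scenario)
        and (not reproduced_only or r.reproduced)
    ]
    # Newest first, as stated in the page footer.
    matches.sort(key=lambda r: r.measured_on, reverse=True)
    return matches[:limit]
```

Under this sketch, calling `filter_rows(rows, reproduced_only=True)` on a corpus in which no row is marked reproduced returns an empty list, which would be consistent with the "No benchmarks match the current filters." message appearing alongside "Marked reproduced: 0".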