Every benchmark in the corpus, filterable by source, scenario, and reproducibility. Newer rows show cold-start vs. steady-state, P5/P95 CI, tokens-per-watt, and accuracy when those fields were captured; legacy rows are labeled "rigor pending".
| Hardware | Model | Quant | Tok/s | Rigor | Source | Date |
|---|---|---|---|---|---|---|
| No benchmarks match the current filters. | | | | | | |

Showing up to 200 rows, newest first. See /resources/benchmark-protocol for what the rigor pills mean and how to reproduce any row.