Original benchmark dataset

Local LLM benchmarks

Tokens-per-second measurements collected on our own hardware and from cited community sources. Every row carries a confidence badge so you know which numbers to trust for purchasing decisions.

Confidence badges: M = Measured, C = Community, ~ = Extrapolated, E = Estimated.

Latest 3 runs

Sorted by date. Click a model or hardware name to drill into the full record.

| Model | Hardware | Conf. | Quant | Ctx | Tokens / sec | VRAM | TTFT | Date |
|---|---|---|---|---|---|---|---|---|
| Mixtral 8x7B Instruct | NVIDIA GeForce RTX 4090 (Ollama) | M | Q4_K_M | 8K | 31.4 | 23.1 GB | 248 ms | Apr 23, 26 |
| Llama 3.1 8B Instruct | NVIDIA GeForce RTX 4090 (Ollama) | M | Q4_K_M | 8K | 104.7 | 5.4 GB | 78 ms | Apr 22, 26 |
| Mistral 7B Instruct v0.3 | NVIDIA GeForce RTX 4090 (Ollama) | M | Q4_K_M | 4K | 112.3 | 5.1 GB | 64 ms | Apr 22, 26 |
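
For readers who want to sanity-check figures like the ones listed above on their own machine, here is a minimal sketch of how a tokens-per-second value and a rough TTFT could be derived from the timing fields Ollama returns in its final `/api/generate` response (`eval_count`, `eval_duration`, `prompt_eval_duration`, `load_duration`, with durations in nanoseconds). This is not necessarily the method used for this dataset. The helper names and the sample values are hypothetical, and treating load time plus prompt evaluation time as TTFT is an assumption that ignores network and queueing overhead.

```python
from typing import Mapping

NS_PER_SECOND = 1_000_000_000
NS_PER_MS = 1_000_000


def decode_tokens_per_second(response: Mapping[str, int]) -> float:
    """Generated tokens divided by generation time, in tokens per second."""
    eval_count = response["eval_count"]            # tokens generated
    eval_duration_ns = response["eval_duration"]   # generation time in ns
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return eval_count / (eval_duration_ns / NS_PER_SECOND)


def approx_ttft_ms(response: Mapping[str, int]) -> float:
    """Rough time-to-first-token proxy in milliseconds.

    Assumption: TTFT is approximated as model load time plus prompt
    evaluation time. Timing the first streamed chunk on the client
    would be a more direct measurement.
    """
    load_ns = response.get("load_duration", 0)
    prompt_ns = response.get("prompt_eval_duration", 0)
    return (load_ns + prompt_ns) / NS_PER_MS


if __name__ == "__main__":
    # Made-up sample values for illustration only, not a real run.
    sample = {
        "eval_count": 1047,
        "eval_duration": 10 * NS_PER_SECOND,
        "prompt_eval_duration": 78 * NS_PER_MS,
        "load_duration": 0,
    }
    print(f"{decode_tokens_per_second(sample):.1f} tok/s")
    print(f"{approx_ttft_ms(sample):.0f} ms TTFT (approx.)")
```

Keeping the calculation in pure functions, separate from the HTTP call, makes it easy to apply the same arithmetic to saved response logs.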