About RunLocalAI

RunLocalAI exists to answer one question: can my hardware run this model, and how fast? Most AI coverage is about cloud APIs. This site focuses on the underserved local-inference niche — Ollama, LM Studio, llama.cpp, vLLM, KoboldCPP, MLX, ExLlamaV2 — and the real hardware operators actually own.

Every page is anchored to verifiable data: parameter counts, license terms, quantization sizes, real tokens-per-second measurements. The site does not publish vibes — it publishes numbers measured directly or sourced from named community contributors with reproduction notes.

What this site is for

The local-AI ecosystem changes weekly. Drivers shift, runtimes fork, model architectures land that nobody has benchmarked yet, and yesterday's “best 24 GB card” gets superseded by a used market that didn't exist last quarter. Most operator decisions sit at the intersection of three questions: What hardware do I have? What model do I want to run? What combination of runtime, quantization, and context length actually fits?

RunLocalAI is built around answering those three questions honestly. The will-it-run engine gives you the math instantly. The hardware verdicts give you buyer-grade analysis with skip warnings and used-market notes. The buyer guides tie those together for the most-asked operator situations.
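As a rough illustration of the kind of fit math involved, the sketch below estimates whether a model fits in VRAM by adding quantized weight size to KV-cache size. It is not the site's will-it-run engine: the function names, the fixed overhead allowance, and the example model shape are all assumptions chosen for illustration, and real runtimes allocate additional buffers that this ignores.

```python
GIB = 2**30  # bytes per GiB


def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_tokens: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV-cache size in GiB (factor 2 covers keys and values)."""
    total_bytes = (2 * n_layers * n_kv_heads * head_dim
                   * context_tokens * bytes_per_elem)
    return total_bytes / GIB


def fits(vram_gib: float, params_billion: float, bits_per_weight: float,
         n_layers: int, n_kv_heads: int, head_dim: int,
         context_tokens: int, overhead_gib: float = 1.0):
    """Return (fits, required_gib). overhead_gib is an assumed flat allowance."""
    required = (weights_gib(params_billion, bits_per_weight)
                + kv_cache_gib(n_layers, n_kv_heads, head_dim, context_tokens)
                + overhead_gib)
    return required <= vram_gib, required


# Hypothetical 8B-parameter model at about 4.5 bits per weight, 8192-token
# context, 16-bit KV cache, checked against a 24 GiB card.
ok, required = fits(24, 8, 4.5, n_layers=32, n_kv_heads=8,
                    head_dim=128, context_tokens=8192)
print(f"fits={ok}, required={required:.2f} GiB")
```

The three inputs map onto the three questions: the VRAM figure is the hardware, the parameter count and layer shape are the model, and bits per weight plus context length are the runtime, quant, and context choice.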

What this site refuses to do

Recommend more expensive hardware than the workload actually needs. The buyer guides include “skip this if” sections by design.
Inflate scores or invent benchmark numbers. Anything we haven't measured ourselves is labeled by provenance: source-backed, community, extrapolated, or estimated, with the source named.
Push affiliate links over honest answers. Affiliate revenue is the byproduct, not the visible goal — see how we make money for the full disclosure.
Pretend local AI is the answer for every workload. Cloud frontier models are still better at some things; the site says so directly when relevant.

Our test hardware

The active first-party benchmark fleet is named on the benchmark protocol page and on each benchmark row. Published owner-measured rows currently include the RTX 5080 desktop capture set where the public evidence package meets the repeat-run gate. Additional rigs are used for spot-checks and community-benchmark reproduction only when the row names the exact hardware. The discipline rule is clear: a verdict only counts as “measured” on the rig where it was run, and that rig is named on the benchmark row.

What this means honestly: the site has a growing first-party corpus, not blanket first-party coverage across every hardware page. Verdicts on hardware without owner-run evidence (RTX 4090, M3 Ultra, RX 7900 XTX, etc.) are labeled exactly as what they are: derived from vendor spec sheets, community benchmark submissions (editorially reviewed before publication), reproduced public sources, or computed fit math. Each row carries a provenance badge (Measured here, Source-backed, Community, Extrapolated, or Estimated), and no row gets to claim a higher tier than its source supports.
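The "no higher tier than its source supports" rule can be sketched as a simple clamp over ordered tiers. This is a minimal illustration, not the site's implementation: the numeric ordering below is an assumption inferred from the order in which the badges are listed, and the names `Provenance` and `effective_badge` are hypothetical.

```python
from enum import IntEnum


class Provenance(IntEnum):
    # Assumed ordering from weakest to strongest evidence.
    ESTIMATED = 1
    EXTRAPOLATED = 2
    COMMUNITY = 3
    SOURCE_BACKED = 4
    MEASURED_HERE = 5


def effective_badge(claimed: Provenance, supported: Provenance) -> Provenance:
    """Return the badge a row may display: never above what its source supports."""
    return min(claimed, supported)


# A row that claims "Measured here" but is backed only by a community
# submission is displayed as "Community".
badge = effective_badge(Provenance.MEASURED_HERE, Provenance.COMMUNITY)
print(badge.name)
```

A row can always display a lower tier than its evidence would allow; the clamp only prevents it from displaying a higher one.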

Cross-vendor coverage (NVIDIA CUDA + AMD ROCm + Apple MLX/Metal) on editorial verdicts reflects published spec sheets and aggregated community benchmark runs, not direct first-party measurement on each platform. We’re honest about that gap; the long-term plan is to widen the first-party rig fleet as the editorial budget allows. The methodology behind every measurement is documented at /methodology.

How content is created

The bylined operator on every page is fred-oline. AI assistance is used for drafting, structuring, and maintenance scans — never for replacing operator judgment on buyer-decision content. The operator reviews and approves every piece of editorial output before it ships. See the editorial policy for the full process and the editorial philosophy for the trust-first principles that govern how the site weighs commercial pressure against reader interest.

How the site makes money

The site earns revenue from affiliate links (Amazon Associates tag fredoline-20, plus first-class US retailers like Newegg and B&H for hardware) and display advertising. It discloses these relationships on every page that carries affiliate-bearing links; see How we make money for the full disclosure, including how editorial decisions are insulated from advertiser pressure.

Contact + corrections

For corrections, factual disagreements, hardware tip-offs, or partnership inquiries, use the contact page. The site logs every correction it ships in the changelog; operator transparency is the durability mechanism.