Pillar guides

Hand-written, deeply researched guides on running AI locally. No listicles, no hedged "X might be good for Y" filler — clear opinions backed by real benchmarks.

Available now

Can I run AI locally on my computer?

Tier-by-tier walkthrough from a 4 GB laptop to a 24 GB GPU. What runs, what doesn't, and the 5-minute Ollama starter path.

Use case
8 min read·Last verified 2026-05-07

Free AI tools that run on your computer

The honest roundup. Ollama, LM Studio, llama.cpp, GPT4All, Open WebUI, AnythingLLM, Continue.dev, Aider — what each does, what's free, what breaks first.

Tooling
9 min read·Last verified 2026-05-07

Local AI vs ChatGPT Plus

Capability, cost, privacy, latency — when local genuinely wins, when ChatGPT Plus stays cheaper. No fake save-$1000-a-year claims.

Concept
10 min read·Last verified 2026-05-07

Best hardware for running local AI models

Tier-by-tier hardware buying guide from $0 (CPU-only) to $4,000+ (dual 3090). What models fit at each tier, gotchas per tier.

Hardware
10 min read·Last verified 2026-05-07

Common local-AI setup mistakes

Ten anti-patterns operators hit: Q2_K quants for coding, ignoring the KV cache, assuming NVLink pools memory, Ollama on Windows with AMD GPUs, driver auto-updates, and more.

Tooling
10 min read·Last verified 2026-05-07

How much does local AI cost?

Hardware, electricity, time, opportunity cost — the four categories. When local saves money, when it doesn't. Honest ranges, no fake ROI.

Concept
10 min read·Last verified 2026-05-07

How to use AI in job applications ethically

What AI is good for in a job search (drafting, tailoring, practice) and what it must not do (impersonate you, invent qualifications, mass-spam). The honest principles.

Use case
10 min read·Last verified 2026-05-07

Choosing a GPU for local AI in 2026

Tier-by-tier buying guide across NVIDIA RTX 50/40/30, AMD RX 7000/9000, Apple Silicon, and the used market. Honest verdicts per tier.

Hardware
10 min read·Last verified 2026-05-05

Will-It-Run methodology

The exact math behind our hardware compatibility predictions. KV cache, runtime overhead, bandwidth-based speed prediction, confidence levels.

Methodology
8 min read·Last verified 2026-05-05

Best GPU for ComfyUI

VRAM tiers for SDXL, Flux, multi-LoRA, and IPAdapter workflows. Where the 12 GB floor is real and where 24 GB starts mattering.

Hardware
9 min read·Last verified 2026-05-07

Best GPU for KoboldCpp

What KoboldCpp's split-GPU + RoPE-scaling architecture does to VRAM math. Honest tier picks for chat, RP, and long-context fiction work.

Hardware
8 min read·Last verified 2026-05-07

Best GPU for AI agents

Agents pin context windows and run multi-turn loops, so the VRAM math changes vs single-turn chat. What 24 GB covers, where 32 GB pays off.

Hardware
9 min read·Last verified 2026-05-07

Best GPU for local OCR

OCR with vision-LLMs (Qwen2.5-VL, InternVL) vs traditional pipelines (Tesseract, PaddleOCR). Hardware fit per workload.

Hardware
8 min read·Last verified 2026-05-07

Best GPU for voice cloning

F5-TTS, XTTS-v2, and Coqui TTS hardware fit. What 8 GB covers for promo-clip workflows; when fine-tuning needs more.

Hardware
8 min read·Last verified 2026-05-07

Best upgrade from RTX 3060 12 GB

When 12 GB stops being enough + what the honest upgrade path looks like. Used 3090 vs new 4060 Ti 16 GB vs hold-and-wait.

Hardware
9 min read·Last verified 2026-05-07

Best local-AI setup for beginners

Realistic first-month setup. Hardware that works, software that installs cleanly, the order to learn things in.

Use case
10 min read·Last verified 2026-05-07

Best free local AI tools (2026)

Updated 2026 roundup. Ollama, LM Studio, llama.cpp, Open WebUI, Continue, Aider, ComfyUI — what's free, what's worth installing first.

Tooling
9 min read·Last verified 2026-05-07

Best AI PC for students

Honest budget tiers under $800, $1,500, and $2,500. What runs at each tier, and when a used 3090 makes sense vs a new mid-tier card.

Hardware
10 min read·Last verified 2026-05-07

In the queue

Guides we're writing next, in rough priority order. Vote for one or suggest a topic via Contact support.

Quantization formats explained: GGUF, EXL2, AWQ, GPTQ, MLX

What Q4_K_M actually means, how mixed-precision quants work, why EXL2 is faster on NVIDIA, and which format to pick for which runner.

Planned

llama.cpp vs vLLM vs ExLlamaV2: when each one wins

Three runners, three philosophies, three different optimal use cases. Real benchmark comparisons and honest tradeoffs.

Planned

Local AI on Apple Silicon: the unified memory advantage

How M-series chips run LLMs, when MLX beats llama.cpp, and which Mac configurations are worth the price.

Planned

Local AI on AMD in 2026: the ROCm story

Where AMD has caught up, where it still trails, and which AMD configurations are worth buying for AI workloads.

Planned

Building a local coding assistant

Pick a model, pair it with an IDE/agent, configure it for your codebase. Real setup walkthrough.

Planned

Local RAG without the bloat

Minimal stack for retrieval-augmented generation entirely on your machine. No LangChain heroics required.

Planned