PrivateGPT

Name: PrivateGPT
Rating: 4.4 (1 reviews)

Fully offline

Air-gappable RAG over your docs. The OG offline-RAG project, now mature and team-friendly.

Editorial verdict: “Best when air-gap compliance is the requirement. Less polished than AnythingLLM, more configurable.”

RAG app

Free

Apache-2.0

★ 4.4 / 5

GitHub ★ 54,000

↗ Homepage ↗ GitHub ↗ Docs

Compatibility at a glance

Which runtime + OS combos this app works against. Source of truth for "will it run on my setup?"

§ Runtimes supported

ollamallama-cppopenai-compat

§ OS / platform

linuxmacoswindows

§ Hardware + model hint

Minimum VRAM

8 GB

Recommended starter model

Llama 3.1 8B Q4_K_M + bge-small-en embedder

→ Build the rest of the stack with /stack-builder → Pick a GPU for this app

What it is

PrivateGPT is the right choice when your compliance posture demands a truly air-gapped RAG pipeline. It bridges to Ollama, llama.cpp, or any OpenAI-compatible endpoint, and runs on Linux, macOS, or Windows with at least 8 GB VRAM — a Llama 3.1 8B Q4_K_M paired with bge-small-en is the baseline. The ingestion pipeline is mature and the backend is swappable (LLM, embedder, vector store), but the UI is functional rather than polished; setup takes more effort than AnythingLLM. If you need reproducible offline retrieval over your docs and can trade interface gloss for configurable control, this is the proven foundation.

✓ Strengths

+Air-gap-friendly from day one
+Swappable LLM + embedder + vector store
+Active maintainer, large community

△ Caveats

−UI is functional, not pretty
−Setup is more involved than AnythingLLM

About the RAG app category

Document retrieval + chat, fully offline-capable.

§ Other rag app apps

Khoj

Best 'AI second brain' app. Self-hosted, local-first, works against Obsidian.

Verba

Best for 'don't make me choose chunking strategy' teams. Opinionated stack works.

Where to go from here

Stack Builder →

Pre-filled with this app's recommended use case + budget tier. Get the full rig + runtime + model picks.

Back to /apps →

The full directory — filter by category, runtime, OS, privacy posture, or VRAM.

Runtimes (/tools) →

What this app talks to: Ollama, vLLM, llama.cpp, MLX, LM Studio. The upstream layer.

Community benchmarks →

Did this app work for you on a specific rig? Submit the benchmark — it powers the model + hardware pages.