PrivateGPT
Air-gappable RAG over your docs. The OG offline-RAG project, now mature and team-friendly.
Editorial verdict: “Best when air-gap compliance is the requirement. Less polished than AnythingLLM, more configurable.”
Compatibility at a glance
Which runtime + OS combos this app works against. Source of truth for "will it run on my setup?"
What it is
PrivateGPT is the right choice when your compliance posture demands a truly air-gapped RAG pipeline. It bridges to Ollama, llama.cpp, or any OpenAI-compatible endpoint, and runs on Linux, macOS, or Windows. Plan for at least 8 GB of VRAM; the baseline configuration is Llama 3.1 8B at Q4_K_M paired with the bge-small-en embedder. The ingestion pipeline is mature and every backend component (LLM, embedder, vector store) is swappable, but the UI is functional rather than polished, and setup takes more effort than AnythingLLM. If you need reproducible offline retrieval over your docs and are willing to trade interface gloss for configurable control, this is the proven foundation.
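The 8 GB VRAM figure for the baseline can be sanity-checked with rough arithmetic. The sketch below is an estimate, not a measurement: the average of about 4.85 bits per weight for Q4_K_M, the fp16 KV cache, the fp32 embedder, and the 8192-token context are all assumptions chosen for illustration, and runtime overhead such as compute buffers is not counted.

```python
"""Back-of-the-envelope VRAM budget for the baseline setup (all figures approximate)."""

GB = 1e9  # decimal gigabytes

# Llama 3.1 8B architecture: roughly 8.03e9 parameters, 32 layers,
# 8 key/value heads (grouped-query attention), head dimension 128.
LLM_PARAMS = 8.03e9
LLM_LAYERS = 32
LLM_KV_HEADS = 8
LLM_HEAD_DIM = 128

Q4_K_M_BITS = 4.85        # ASSUMPTION: approximate average bits per weight
EMBEDDER_PARAMS = 33.4e6  # bge-small-en is roughly a 33M-parameter model
EMBEDDER_BITS = 32        # ASSUMPTION: embedder held in fp32


def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Size of the model weights in GB at a given average bit width."""
    return n_params * bits_per_weight / 8 / GB


def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_value: int = 2) -> float:
    """Size of the KV cache in GB: one K and one V tensor per layer."""
    values = 2 * n_layers * n_kv_heads * head_dim * context_tokens
    return values * bytes_per_value / GB


def baseline_budget(context_tokens: int = 8192) -> dict[str, float]:
    """Itemized estimate for the LLM, its KV cache, and the embedder."""
    llm = weights_gb(LLM_PARAMS, Q4_K_M_BITS)
    kv = kv_cache_gb(LLM_LAYERS, LLM_KV_HEADS, LLM_HEAD_DIM, context_tokens)
    emb = weights_gb(EMBEDDER_PARAMS, EMBEDDER_BITS)
    return {"llm_weights": llm, "kv_cache": kv, "embedder": emb,
            "total": llm + kv + emb}


if __name__ == "__main__":
    for name, size in baseline_budget().items():
        print(f"{name:>12}: {size:.2f} GB")
```

Under these assumptions the weights come to a little under 5 GB and the total stays below 8 GB, which leaves some headroom for runtime buffers. A longer context or a less aggressive quantization would shrink that margin.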
✓ Strengths
- Air-gap-friendly from day one
- Swappable LLM + embedder + vector store
- Active maintainer, large community
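The "swappable LLM + embedder + vector store" point can be pictured as three small interfaces behind one pipeline. The sketch below is a generic illustration of that design, not PrivateGPT's actual code or API: every class and method name here (`Embedder`, `VectorStore`, `LLM`, `RagPipeline`, and the toy implementations) is hypothetical, and the hashing embedder and stub LLMs stand in for real components such as bge-small-en or a model served by Ollama.

```python
import math
import re
import zlib
from dataclasses import dataclass
from typing import Protocol


class Embedder(Protocol):
    def embed(self, text: str) -> list[float]: ...


class VectorStore(Protocol):
    def add(self, text: str, vector: list[float]) -> None: ...
    def search(self, vector: list[float], k: int) -> list[str]: ...


class LLM(Protocol):
    def complete(self, prompt: str) -> str: ...


class HashingEmbedder:
    """Toy embedder: bag of words hashed into a fixed-size vector."""

    def __init__(self, dims: int = 512) -> None:
        self.dims = dims

    def embed(self, text: str) -> list[float]:
        vec = [0.0] * self.dims
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            # crc32 is deterministic across runs, unlike built-in hash()
            vec[zlib.crc32(token.encode()) % self.dims] += 1.0
        return vec


class InMemoryStore:
    """Toy vector store: brute-force cosine similarity over a list."""

    def __init__(self) -> None:
        self.items: list[tuple[str, list[float]]] = []

    def add(self, text: str, vector: list[float]) -> None:
        self.items.append((text, vector))

    def search(self, vector: list[float], k: int) -> list[str]:
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(self.items, key=lambda item: cosine(vector, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


class StubLLM:
    """Placeholder LLM that echoes the prompt it was given."""

    def complete(self, prompt: str) -> str:
        return f"[stub] {prompt}"


class ShoutingLLM:
    """A second placeholder, used to show that the LLM can be swapped."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


@dataclass
class RagPipeline:
    embedder: Embedder
    store: VectorStore
    llm: LLM

    def ingest(self, docs: list[str]) -> None:
        for doc in docs:
            self.store.add(doc, self.embedder.embed(doc))

    def retrieve(self, question: str, k: int = 1) -> list[str]:
        return self.store.search(self.embedder.embed(question), k)

    def ask(self, question: str, k: int = 1) -> str:
        context = "\n".join(self.retrieve(question, k))
        return self.llm.complete(f"Context:\n{context}\n\nQuestion: {question}")


DOCS = [
    "Backups run nightly at 02:00 to the offline NAS.",
    "The VPN certificate rotates every 90 days.",
    "Badge access logs are retained for one year.",
]

if __name__ == "__main__":
    pipeline = RagPipeline(HashingEmbedder(), InMemoryStore(), StubLLM())
    pipeline.ingest(DOCS)
    print(pipeline.ask("How often does the VPN certificate rotate?"))
    pipeline.llm = ShoutingLLM()  # swap one component; the rest is untouched
    print(pipeline.ask("How often does the VPN certificate rotate?"))
```

Because the pipeline depends only on the three protocols, replacing any one component is a one-line change, and nothing in the loop requires a network call, which is the property an air-gapped deployment relies on.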
△ Caveats
- UI is functional, not pretty
- Setup is more involved than AnythingLLM
About the RAG app category
Document retrieval + chat, fully offline-capable.