AnythingLLM

Hybrid (offline or cloud)

Docs-aware chat with workspaces. Drop a folder of PDFs, get a working RAG chatbot in 5 minutes.

Editorial verdict: “Best fast-RAG app. The workspace model is the right abstraction for doc-corpora chat.”

Chat UI
Free
MIT
4.5 / 5
GitHub ★ 34,000

Compatibility at a glance

Which runtime + OS combos this app works against. Source of truth for "will it run on my setup?"

§ Runtimes supported
ollama · lm-studio · openai-compat
§ OS / platform
macos · linux · windows
§ Hardware + model hint
Minimum VRAM
8 GB
Recommended starter model
Llama 3.1 8B Q4_K_M + bge-m3 embedder

What it is

AnythingLLM is built around 'workspaces' — each one is a chat + a knowledge base + a model config. Drop a PDF folder, the app chunks and embeds it locally (or via OpenAI), and you can chat against it. Talks to Ollama, LM Studio, local embedding models, and many cloud providers. The fastest path from 'I have a folder of docs' to 'I have a chatbot for that folder.'
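The workspace flow above can also be driven headlessly over AnythingLLM's developer API. A minimal sketch, with the caveat that the port (3001), the exact endpoint paths, and the `$API_KEY` placeholder are assumptions based on the documented developer API and may differ by version — check Settings → Developer API in your install:

```shell
# Hedged sketch: create a workspace, then chat against its embedded docs.
# Assumes AnythingLLM is running locally and an API key has been generated.

# Create a workspace named "docs"
curl -s -X POST http://localhost:3001/api/v1/workspace/new \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "docs"}'

# Ask a question against the documents embedded in that workspace
curl -s -X POST http://localhost:3001/api/v1/workspace/docs/chat \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "Summarize the onboarding PDF", "mode": "query"}'
```

The chat response includes the answer plus the source-passage citations the UI shows inline.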

✓ Strengths

  • Workspace abstraction is genuinely well-designed
  • Local + cloud embedding options
  • Citations link back to the source doc passages

△ Caveats

  • Default Docker config consumes a lot of disk for embeddings
  • Best perf needs a separate embedding-model service
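The Docker disk caveat is easier to manage with an explicit named volume you can inspect and prune. A minimal sketch — the image name `mintplexlabs/anythingllm`, the `/app/server/storage` path, and the `STORAGE_DIR` variable match the commonly documented defaults, but verify against the current README for your version:

```shell
# Persist embeddings and the vector cache to a named volume instead of
# anonymous container storage, so disk usage is visible and reclaimable.
docker run -d -p 3001:3001 \
  -v anythingllm_storage:/app/server/storage \
  -e STORAGE_DIR=/app/server/storage \
  mintplexlabs/anythingllm

# Check how much disk the embeddings are actually using
docker system df -v | grep anythingllm_storage
```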