HOW-TO · RAG
How to Enable ChromaDB Persistence for Production
Target environment
Ubuntu 24.04 · Ollama 0.4.x
PREREQUISITES
ChromaDB running, persistent storage available
What this does
In-memory ChromaDB instances lose all data when the process exits. This guide explains how to configure disk-based persistence for production deployments, shut down safely, manage backups, and handle the permission issues that commonly arise in server environments.
Steps
Create a persistent client pointing to a dedicated directory.
import os

import chromadb

storage_path = "/opt/chromadb/data"
os.makedirs(storage_path, exist_ok=True)
client = chromadb.PersistentClient(path=storage_path)
print("Persistence enabled at:", storage_path)
Add data and confirm it survives a process restart.
col = client.get_or_create_collection("production_kb")
col.add(
    ids=["prod-1"],
    documents=["Production RAG pipeline ready for traffic."],
    metadatas=[{"env": "prod", "version": "1.0"}],
)
print("Documents persisted:", col.count())
Wrap the client in a singleton to avoid multiple instances.
_client = None

def get_chroma_client():
    global _client
    if _client is None:
        _client = chromadb.PersistentClient(path="/opt/chromadb/data")
    return _client
Perform a graceful shutdown test. Start the script, stop it, and restart - data should remain.
# Graceful shutdown: ensure ChromaDB writes flush before exit
import atexit

def shutdown_hook():
    # PersistentClient auto-flushes; explicit sync is not needed
    print("ChromaDB client shutting down gracefully.")

atexit.register(shutdown_hook)
Configure file permissions for the storage directory.
sudo mkdir -p /opt/chromadb/data
sudo chown -R $(whoami):$(id -gn) /opt/chromadb/data
chmod 755 /opt/chromadb/data
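One caveat for the graceful shutdown step: Python does not run atexit hooks when the process is killed by a signal it does not handle, and SIGTERM is unhandled by default. A minimal sketch, assuming you want the shutdown hook from the step above to run on SIGTERM, is to install a handler that exits normally. The name handle_sigterm is illustrative and not part of ChromaDB.

```python
import atexit
import signal
import sys


def shutdown_hook():
    # Same hook as in the step above; runs on normal interpreter exit.
    print("ChromaDB client shutting down gracefully.")


def handle_sigterm(signum, frame):
    # sys.exit raises SystemExit, so the interpreter shuts down normally
    # and atexit hooks run. Without this handler, SIGTERM ends the
    # process before any atexit hook is called.
    sys.exit(0)


atexit.register(shutdown_hook)
signal.signal(signal.SIGTERM, handle_sigterm)
```

SIGKILL cannot be caught, so no handler helps in that case; this only covers SIGTERM (for example from pkill -15 or a systemd stop).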
Verification
# First process: write one document, then exit
python3 -c "
import chromadb
c = chromadb.PersistentClient(path='/tmp/persistence_test')
col = c.get_or_create_collection('persist')
col.add(ids=['x'], documents=['survives restart'])
"
# Second process: reopen the same path and read the data back
python3 -c "
import chromadb
c2 = chromadb.PersistentClient(path='/tmp/persistence_test')
col2 = c2.get_collection('persist')
print('After restart:', col2.count(), 'docs')
c2.delete_collection('persist')
"
# Expected: After restart: 1 docs
Common failures
- Data loss on unclean exit. ChromaDB's PersistentClient flushes on every write, but killing the process with SIGKILL can leave the write-ahead log in an inconsistent state. Use SIGTERM or pkill -15 for graceful shutdown.
- Storage path owned by root. If the directory is owned by root, a non-root process cannot write, causing silent failures or permission errors at query time. Fix with chown as shown above.
- Disk full. When disk space is exhausted, ChromaDB cannot flush writes. Monitor with df -h /opt/chromadb and alert at 80% usage.
- Concurrent writes from multiple processes. Two processes writing to the same persistent directory cause lock contention. Use a single writer process and route reads through it, or switch to client-server mode.
- Path not absolute. Relative paths like ./data work in development but resolve differently depending on the working directory in production. Always use absolute paths.
- Version mismatch. The installed package or runtime differs from the command shown; check the version first and rerun the smallest verification command.
- Local environment drift. Another service, virtual environment, model, or path is being used; print the active binary path and configuration before changing the guide steps.
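Several of the failures above (relative path, wrong ownership, full disk) can be caught before the client starts. The following is a rough sketch using only the Python standard library. The helper preflight_check and its 80% default threshold are hypothetical, chosen to mirror the guidance above, and are not ChromaDB features.

```python
import os
import shutil


def preflight_check(path, max_used_fraction=0.80):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if not os.path.isabs(path):
        problems.append(f"path is not absolute: {path}")
        return problems  # the remaining checks assume an absolute path
    if not os.path.isdir(path):
        problems.append(f"directory does not exist: {path}")
        return problems
    # Write plus execute (traverse) permission is needed to create files.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"directory is not writable by the current user: {path}")
    usage = shutil.disk_usage(path)
    if usage.total and usage.used / usage.total >= max_used_fraction:
        problems.append(f"disk is {usage.used / usage.total:.0%} full")
    return problems
```

One way to use it is to call preflight_check("/opt/chromadb/data") before creating the PersistentClient and refuse to start if the returned list is non-empty.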