HOW-TO · RAG
How to Set Up ChromaDB from Scratch
Target environment
Ubuntu 24.04 · Ollama 0.4.x
PREREQUISITES
Python 3.10+, pip
What this does
This guide explains how to install and initialize ChromaDB on a fresh system. You will install the package, create an in-memory client, switch to a persistent client that saves data to disk, create your first collection, and then add and query documents. ChromaDB is a widely used vector store in LangChain integrations and supports embedding-based retrieval out of the box.
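To make "embedding-based retrieval" concrete, here is a minimal standard-library sketch of the idea: documents and a query are represented as vectors, and the document whose vector is closest to the query wins. The three-dimensional vectors and the helpers `cosine_similarity` and `nearest` are hypothetical, made up for illustration. Cosine similarity is likewise an assumption chosen for the sketch. ChromaDB itself computes real embeddings with a model and searches an index rather than scanning linearly like this.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vec, store):
    """Return the id of the stored vector most similar to query_vec."""
    return max(store, key=lambda doc_id: cosine_similarity(query_vec, store[doc_id]))


# Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
toy_store = {
    "doc-1": [0.9, 0.1, 0.0],
    "doc-2": [0.1, 0.8, 0.3],
}

query = [1.0, 0.2, 0.0]
best = nearest(query, toy_store)
print("Closest document:", best)
```

A vector database does the same job at scale: it stores the vectors, indexes them, and returns the closest ones along with the original text and metadata.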
Steps
Install ChromaDB via pip.
pip install chromadb
# Expected: Successfully installed chromadb-0.5.x

Create an in-memory instance for quick prototyping.
import chromadb
# In-memory - no data persists after the process exits
client = chromadb.Client()
print("Client created:", client)

Create a persistent client that saves to disk.
client = chromadb.PersistentClient(path="./chromadb_data")
print("Persistent client ready at ./chromadb_data")

Create your first collection. Collections are namespaces for related documents. Each collection uses a default embedding function unless you override it.
col = client.get_or_create_collection(name="my-first-collection")
print("Collection count:", col.count())

Add documents and verify they are stored.
col.add(
    ids=["doc-1", "doc-2"],
    documents=[
        "ChromaDB is a fast open-source vector database.",
        "It supports embedding-based similarity search."
    ],
    metadatas=[{"source": "readme"}, {"source": "docs"}]
)
print("Stored docs:", col.count())

Query the collection.
results = col.query(
    query_texts=["What is ChromaDB?"],
    n_results=1
)
print(results["documents"][0][0])
Verification
python3 -c "
import chromadb
c = chromadb.Client()
col = c.get_or_create_collection('test')
col.add(ids=['a'], documents=['hello'])
print('Docs:', col.count())
"
# Expected: Docs: 1
Common failures
- Permission error on disk path. Running as a non-root user against a root-owned directory fails. Use a path inside your home directory, such as /home/user/chromadb_data.
- Python version too old. This guide assumes Python 3.10 or newer (see Prerequisites). Check with python3 --version and upgrade if needed.
- Embedding function not found. A collection created without an explicit embedder falls back to ChromaDB's default. If the default is unavailable, pass one explicitly with embedding_function=YourEmbedFn().
- Port already in use. In client-server mode, a port conflict causes startup failure. Check with lsof -i :8000 and stop the conflicting process.
- Data directory corrupted. Deleting the data folder while the process runs can leave ChromaDB in an inconsistent state. Always shut down cleanly before removing the directory.
- Version mismatch. The installed package or runtime differs from the command shown. Check the version first, then rerun the smallest verification command.
- Local environment drift. Another service, virtual environment, model, or path is being used. Print the active binary path and configuration before changing the guide steps.
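Several of the failures above can be checked before starting ChromaDB. The following is a minimal preflight sketch using only the standard library. The helper names `python_ok`, `dir_writable`, and `port_in_use` are hypothetical, and the path ./chromadb_data and port 8000 are taken from this guide rather than detected from your setup.

```python
import os
import socket
import sys


def python_ok(minimum=(3, 10)):
    """True if the running interpreter is at least the given version."""
    return sys.version_info >= minimum


def dir_writable(path):
    """True if path, or its closest existing parent, is writable by this user."""
    probe = os.path.abspath(path)
    # Walk upward until an existing directory is found; the root always exists.
    while not os.path.exists(probe):
        probe = os.path.dirname(probe)
    return os.access(probe, os.W_OK | os.X_OK)


def port_in_use(port, host="127.0.0.1"):
    """True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("Python 3.10+:", python_ok())
    print("Active interpreter:", sys.executable)
    print("Data dir writable:", dir_writable("./chromadb_data"))
    print("Port 8000 in use:", port_in_use(8000))
```

Printing `sys.executable` also helps with the environment-drift case, since it shows which virtual environment the script is running under.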