HOW-TO · RAG
How to Create and Manage ChromaDB Collections
Target environment
Ubuntu 24.04 · Ollama 0.4.x
PREREQUISITES
ChromaDB installed
What this does
Collections in ChromaDB are named containers that store documents, their embeddings, and associated metadata. This guide covers creating, listing, updating, deleting, and inspecting collections so you can organize your vector data effectively in production pipelines.
Steps
Create a persistent client and a named collection.
import chromadb

client = chromadb.PersistentClient(path="./chroma_collections")
col = client.get_or_create_collection(name="knowledge_base")
print("Created:", col.name)
List all collections in the client.
collections = client.list_collections()
print("Collections:", [c.name for c in collections])
Add documents with metadata to the collection.
col.add(
    ids=["article-1", "article-2", "article-3"],
    documents=[
        "Vector databases store data as numerical vectors.",
        "ChromaDB supports similarity search on embeddings.",
        "RAG pipelines combine retrieval with generative models.",
    ],
    metadatas=[
        {"category": "database", "author": "ops"},
        {"category": "database", "author": "ops"},
        {"category": "ai", "author": "ml"},
    ],
)
print("Total documents:", col.count())
Inspect collection details.
print("Name:", col.name)
print("Metadata:", col.metadata)
Modify collection data with upsert. This replaces existing documents with matching IDs and adds any IDs that are new.
col.upsert(
    ids=["article-1"],
    documents=["Updated: Vector databases store data as numerical vectors for fast similarity search."],
    metadatas=[{"category": "database", "author": "ops", "updated": True}],
)
Delete a specific collection. Do this last: once the collection is deleted, the col handle no longer points at a live collection and further calls on it fail.
client.delete_collection(name="knowledge_base")
print("Deleted. Remaining:", [c.name for c in client.list_collections()])
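The listing and deletion steps can be wrapped in a small guard so that deleting a missing collection does not raise. The sketch below is an illustration, not part of ChromaDB: collection_names and delete_if_exists are hypothetical helper names, and FakeClient is a stand-in used only so the example runs without a database. It assumes that list_collections() may return either objects with a .name attribute or plain name strings, since this has differed between ChromaDB releases.

```python
def collection_names(client):
    """Return collection names whether list_collections() yields objects or strings."""
    # Assumption: some releases return Collection objects, others plain names.
    return [getattr(c, "name", c) for c in client.list_collections()]


def delete_if_exists(client, name):
    """Delete the collection only if it is present; return True if something was deleted."""
    if name not in collection_names(client):
        return False
    client.delete_collection(name=name)
    return True


class FakeClient:
    """Hypothetical in-memory stand-in for a ChromaDB client (demo only)."""

    def __init__(self, names):
        self._names = list(names)

    def list_collections(self):
        return list(self._names)

    def delete_collection(self, name):
        self._names.remove(name)


demo = FakeClient(["knowledge_base", "scratch"])
print(delete_if_exists(demo, "knowledge_base"))  # True
print(delete_if_exists(demo, "knowledge_base"))  # False
print(collection_names(demo))                    # ['scratch']
```

With a real client, the same two helpers should work unchanged by passing the PersistentClient created in the first step in place of FakeClient.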
Verification
python3 -c "
import chromadb
c = chromadb.PersistentClient(path='/tmp/chroma_test')
col = c.get_or_create_collection('test_col')
col.add(ids=['x'], documents=['hello world'])
print('Count:', col.count())
c.delete_collection('test_col')
print('After delete:', len(c.list_collections()))
"
# Expected: Count: 1
# Expected: After delete: 0
Common failures
- Duplicate ID error. Passing the same ID twice in one add call raises an error. Adding an ID that is already stored either raises an error or is silently skipped, depending on the ChromaDB version, so the old document can remain in place. Use upsert instead of add when you want to overwrite, or generate unique IDs with a prefix or timestamp.
- Collection not found on delete. Calling delete_collection on a non-existent name raises an error (a ValueError in some releases). Check existence first with client.list_collections(), or call get_or_create_collection beforehand so the name is guaranteed to exist.
- Metadata type mismatch. ChromaDB stores metadata as key-value pairs; values must be strings, integers, floats, or booleans. Nested dicts or lists as metadata values are not supported and cause runtime errors.
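- Invalid collection names. ChromaDB validates names when a collection is created, so names that contain spaces or most special characters, or that are very short or very long, are typically rejected with an error rather than accepted. Stick to lowercase alphanumerics with hyphens or underscores, as in the examples above.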
- Count returns 0 after add. If the embedding function is misconfigured, documents may not be indexed. Verify the embedder is set correctly when creating the collection.
- Version mismatch. The installed package or runtime differs from the one the commands assume; check the version first, then rerun the smallest verification command.
- Local environment drift. Another service, virtual environment, model, or path is being used; print the active binary path and configuration before changing the guide steps.
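Two of the failures above, unsupported metadata values and problematic collection names, can be caught before any data reaches ChromaDB. The following is a minimal sketch under stated assumptions: is_conservative_name and flatten_metadata are hypothetical helpers, the name pattern is a deliberately strict convention (3 to 63 characters, lowercase alphanumerics, hyphens, underscores) rather than ChromaDB's own validator, and serializing lists to JSON strings is one possible workaround, not a library feature.

```python
import json
import re

# Value types that the guide lists as valid metadata values.
_SCALARS = (str, int, float, bool)

# Conservative naming convention (an assumption, stricter than strictly needed).
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]")


def is_conservative_name(name):
    """Return True if the name fits the conservative convention above."""
    return _NAME_RE.fullmatch(name) is not None


def flatten_metadata(meta, sep="."):
    """Flatten nested dicts into dotted keys and JSON-encode other non-scalar values."""
    flat = {}
    for key, value in meta.items():
        if isinstance(value, dict):
            # Recurse, then prefix each nested key with the parent key.
            for sub_key, sub_value in flatten_metadata(value, sep).items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        elif isinstance(value, _SCALARS):
            flat[key] = value
        else:
            # Lists, None, and other values become strings.
            flat[key] = json.dumps(value, default=str)
    return flat


print(is_conservative_name("knowledge_base"))  # True
print(is_conservative_name("My Collection"))   # False

raw = {"category": "ai", "tags": ["rag", "llm"], "source": {"team": "ml", "year": 2024}}
print(flatten_metadata(raw))
# {'category': 'ai', 'tags': '["rag", "llm"]', 'source.team': 'ml', 'source.year': 2024}
```

The flattened dict can then be passed in metadatas to add or upsert. Filtering on a JSON-encoded list only matches the whole string, so keep values you need to filter on as separate scalar keys.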