What this does

LangChain provides a high-level abstraction over ChromaDB through its Chroma vector store integration. This guide shows how to initialize the vector store, configure an Ollama-backed embedding model, load documents, run a similarity search, and wire the store into a retrieval-augmented generation (RAG) chain.

Steps

Create the ChromaDB vector store with an Ollama embedding function.

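from langchain_ollama import OllamaEmbeddings
# Chroma lives in the langchain-chroma package (pip install langchain-chroma);
# the older langchain.vectorstores import path is deprecated.
from langchain_chroma import Chroma

# The model must already be available locally: ollama pull mxbai-embed-large
embed_model = OllamaEmbeddings(model="mxbai-embed-large")

vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embed_model,
    persist_directory="./langchain_chroma",  # collection data is stored on disk here
)
# _collection is a private attribute; it is used here only as a quick sanity check.
print("Vector store created:", vectorstore._collection.name)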

Add documents to the vector store. Use text splitting for better retrieval granularity.

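# The splitters ship in the langchain-text-splitters package;
# the older langchain.text_splitter import path is deprecated.
from langchain_text_splitters import RecursiveCharacterTextSplitter

docs = [
    "ChromaDB is an open-source vector database.",
    "LangChain connects LLMs with external data sources.",
    "Ollama runs large language models locally.",
]

# Each sample text is shorter than chunk_size, so it stays a single chunk;
# longer texts are cut into pieces of at most 100 characters with 20 overlapping.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20)
split_docs = text_splitter.create_documents(docs)

# add_documents embeds each chunk through Ollama and writes it to the collection.
vectorstore.add_documents(split_docs)
print("Documents added:", vectorstore._collection.count())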
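
To make the chunk_size and chunk_overlap parameters concrete, here is a minimal sketch of fixed-width chunking with overlap. It is not the algorithm RecursiveCharacterTextSplitter uses (that splitter prefers to break on separators such as paragraphs and spaces); the split_with_overlap helper is a hypothetical name introduced only to show how the two numbers interact.

```python
def split_with_overlap(text: str, chunk_size: int = 100, chunk_overlap: int = 20) -> list[str]:
    """Naive character chunking: illustration only, not LangChain's splitter."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


short_text = "ChromaDB is an open-source vector database."
long_text = "".join(str(i % 10) for i in range(250))  # 250 synthetic characters

print(len(split_with_overlap(short_text)))              # 1
print([len(c) for c in split_with_overlap(long_text)])  # [100, 100, 90]
```

The last 20 characters of each chunk reappear at the start of the next one, which is what lets a sentence that straddles a boundary still be retrieved in one piece.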

Perform similarity search.

results = vectorstore.similarity_search("What is ChromaDB?", k=2)
for doc in results:
    print("-", doc.page_content)
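
Conceptually, a similarity search embeds the query and returns the k stored chunks whose vectors are closest to it. The sketch below shows that ranking step with cosine similarity on hand-written three-dimensional vectors. Everything in it is an assumption made for illustration: the toy vectors are invented, real embeddings have hundreds of dimensions or more, and the distance metric Chroma actually applies depends on how the collection is configured.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k texts whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Invented toy vectors standing in for real embeddings.
toy_vectors = {
    "ChromaDB is an open-source vector database.": [0.9, 0.1, 0.0],
    "LangChain connects LLMs with external data sources.": [0.2, 0.9, 0.1],
    "Ollama runs large language models locally.": [0.1, 0.2, 0.9],
}
query_vector = [0.8, 0.3, 0.0]  # pretend embedding of "What is ChromaDB?"

for text in top_k(query_vector, toy_vectors, k=2):
    print("-", text)
```

With these toy values the ChromaDB sentence ranks first and the LangChain sentence second, mirroring the shape of the k=2 result above.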

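Build a RAG chain with RetrievalQA.

from langchain_ollama import ChatOllama
# RetrievalQA is a legacy chain; on recent LangChain releases it may need to be
# imported from langchain_classic.chains instead.
from langchain.chains import RetrievalQA

# The chat model must already be available locally: ollama pull llama3.2
llm = ChatOllama(model="llama3.2")
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
# The chain takes the question under the "query" key and returns the answer under "result".
answer = qa_chain.invoke({"query": "How does ChromaDB integrate with LangChain?"})
print(answer["result"])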
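
RetrievalQA.from_chain_type defaults to the "stuff" strategy: the retrieved chunks are concatenated into one context block and sent to the model together with the question. The sketch below shows that assembly step in plain Python. The build_stuffed_prompt helper and its prompt wording are hypothetical; LangChain's own prompt template differs.

```python
def build_stuffed_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Join retrieved chunks into one context block and append the question.

    Illustrative template only; not the prompt LangChain actually sends.
    """
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


chunks = [
    "ChromaDB is an open-source vector database.",
    "LangChain connects LLMs with external data sources.",
]
print(build_stuffed_prompt("How does ChromaDB integrate with LangChain?", chunks))
```

Because every retrieved chunk lands in a single prompt, the number of chunks returned by the retriever and the chunk size together bound how much context the model sees.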

Verification

python3 -c "
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings
embed = OllamaEmbeddings(model='mxbai-embed-large')
vs = Chroma(embedding_function=embed, persist_directory='/tmp/lc_test', collection_name='t')
vs.add_texts(['hello world'])
print('Search result:', vs.similarity_search('hello', k=1)[0].page_content)
"
# Expected: Search result: hello world

Common failures

Ollama server not running. LangChain's OllamaEmbeddings sends HTTP requests to localhost:11434. Start Ollama with ollama serve before running the script.
Embedding model mismatch. A collection stores vectors of one fixed dimension. Adding to or querying an existing collection with a different embedding model than the one that created it causes a dimension mismatch error. Use the same model name everywhere, or create a new collection when you switch models.
Missing integration packages. The Ollama and Chroma integrations ship as separate packages. Install them explicitly with pip install langchain-ollama langchain-chroma.
Persist directory locked. Opening the same persist directory from two processes simultaneously can cause a lock error. Use a single writer process.
Wrong import path. LangChain moved Ollama integrations to langchain_ollama. The old langchain.embeddings path may not include Ollama. Use the explicit langchain_ollama import.
Version mismatch. The installed package or runtime differs from the command shown. Check the version first, then rerun the smallest verification command.
Local environment drift. Another service, virtual environment, model, or path is being used. Print the active binary path and configuration before changing the guide steps.
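
Several of the failures above can be caught before running the guide. The following pre-flight sketch uses only the standard library: it checks that something answers HTTP on the Ollama address and that the integration packages are importable. The helper names are hypothetical, and the check assumes a running Ollama server answers a plain GET on its base URL with status 200.

```python
import http.client
import importlib.util
import urllib.error
import urllib.request


def ollama_is_reachable(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET on base_url succeeds with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a malformed reply.
        return False


def missing_packages(modules=("langchain_ollama", "langchain_chroma", "chromadb")) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_reachable())
    print("Missing packages:", missing_packages() or "none")
```

If the first line prints False, start the server with ollama serve; if the second lists any names, install the corresponding packages in the active virtual environment.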

How to Use ChromaDB with LangChain for Vector Storage
