HOW-TO · RAG
How to Use ChromaDB with LangChain for Vector Storage
Target environment
Ubuntu 24.04 · Ollama 0.4.x
Prerequisites
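ChromaDB and LangChain installed, plus the langchain-ollama package. The examples also assume a running Ollama server with the mxbai-embed-large and llama3.2 models pulled.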
What this does
LangChain provides a high-level abstraction over ChromaDB through its Chroma vector store integration. This guide shows how to initialize the vector store, configure an Ollama-backed embedding model, load documents, and run a similarity search, all within a single LangChain pipeline.
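To make the add-then-search flow concrete before touching the real libraries, here is a minimal in-memory sketch. It is not how ChromaDB or LangChain are implemented: `toy_embed` is a hypothetical stand-in that counts words instead of calling an embedding model, and `ToyVectorStore` is an illustrative class whose method names only mirror the ones used later in this guide.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Illustrative in-memory store; a real vector store persists and indexes vectors."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add_texts(self, texts):
        # Embed each text once at insert time and keep the vector next to it
        for text in texts:
            self._items.append((text, toy_embed(text)))

    def similarity_search(self, query, k=2):
        # Embed the query the same way, then rank stored texts by similarity
        query_vec = toy_embed(query)
        ranked = sorted(
            self._items,
            key=lambda item: cosine(query_vec, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add_texts([
    "ChromaDB is an open-source vector database.",
    "LangChain connects LLMs with external data sources.",
    "Ollama runs large language models locally.",
])
top = store.similarity_search("What is ChromaDB?", k=1)
print(top[0])
```

The real pipeline follows the same shape: documents and queries must be embedded by the same model, which is why the guide passes one embedding object to the vector store and reuses it for every call.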
Steps
Create the ChromaDB vector store with an Ollama embedding function.
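from langchain_ollama import OllamaEmbeddings
from langchain_chroma import Chroma  # provided by the langchain-chroma package; replaces the deprecated langchain.vectorstores path
# Embeddings are computed by the local Ollama server
embed_model = OllamaEmbeddings(model="mxbai-embed-large")
vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embed_model,
    persist_directory="./langchain_chroma",
)
# _collection is a private attribute; it is read here only to confirm the collection name
print("Vector store created:", vectorstore._collection.name)
Add documents to the vector store. Use text splitting for better retrieval granularity.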
from langchain_text_splitters import RecursiveCharacterTextSplitter
docs = [
    "ChromaDB is an open-source vector database.",
    "LangChain connects LLMs with external data sources.",
    "Ollama runs large language models locally.",
]
# Split into chunks of up to 100 characters, with 20 characters shared between neighbors
text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20)
split_docs = text_splitter.create_documents(docs)
vectorstore.add_documents(split_docs)
print("Documents added:", vectorstore._collection.count())
Perform a similarity search.
# Return the two stored chunks closest to the query embedding
results = vectorstore.similarity_search("What is ChromaDB?", k=2)
for doc in results:
    print("-", doc.page_content)
Build a RAG chain with RetrievalQA.
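from langchain_ollama import ChatOllama
from langchain.chains import RetrievalQA
llm = ChatOllama(model="llama3.2")
# RetrievalQA fetches relevant chunks through the retriever and passes them to the LLM
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
# The chain reads its input from the "query" key and returns the answer under "result"
answer = qa_chain.invoke({"query": "How does ChromaDB integrate with LangChain?"})
print(answer["result"])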
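The chunk_size and chunk_overlap settings used in the splitting step are easier to reason about with a small example. The sketch below is a simplified fixed-window splitter, not the algorithm RecursiveCharacterTextSplitter uses (that splitter prefers to break on separators such as paragraphs and spaces). The function name `chunk_with_overlap` is hypothetical and exists only for illustration.

```python
def chunk_with_overlap(text: str, chunk_size: int = 100, chunk_overlap: int = 20) -> list[str]:
    """Split text into fixed-size character windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"  # 26 characters
chunks = chunk_with_overlap(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
# abcdefghij
# ghijklmnop
# mnopqrstuv
# stuvwxyz
```

Each chunk repeats the last four characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk. The three sample strings in this guide are each shorter than 100 characters, so with chunk_size=100 they pass through the splitter as one chunk apiece.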
Verification
python3 -c "
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings
embed = OllamaEmbeddings(model='mxbai-embed-large')
vs = Chroma(embedding_function=embed, persist_directory='/tmp/lc_test', collection_name='t')
vs.add_texts(['hello world'])
print('Search result:', vs.similarity_search('hello', k=1)[0].page_content)
"
# Expected: Search result: hello world
Common failures
- Ollama server not running. LangChain's OllamaEmbeddings sends HTTP requests to localhost:11434. Run ollama serve to start the server before running the script.
- Embedding model mismatch. Using mxbai-embed-large in LangChain but a different model in ChromaDB causes embedding dimension mismatches. Ensure the same model name is used everywhere.
- Missing langchain-ollama package. The integration is in a separate package. Install it explicitly: pip install langchain-ollama
- Persist directory locked. Opening the same persist directory from two processes simultaneously causes a lock error. Always use a single writer process.
- Wrong import path. LangChain moved Ollama integrations to langchain_ollama. The old langchain.embeddings path may not include Ollama. Use the explicit langchain_ollama import.
- Version mismatch. The installed package or runtime differs from the command shown; check the version first and rerun the smallest verification command.
- Local environment drift. Another service, virtual environment, model, or path is being used; print the active binary path and configuration before changing the guide steps.
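The first failure in the list can be ruled out with a quick preflight check before any embedding call. This is a minimal sketch using only the standard library; it assumes a default Ollama install answers a plain HTTP GET on its root URL with status 200, and the helper name `ollama_is_reachable` is hypothetical.

```python
import urllib.error
import urllib.request


def ollama_is_reachable(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at base_url with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status
        return False


if ollama_is_reachable():
    print("Ollama server is reachable")
else:
    print("Ollama server not reachable; start it with: ollama serve")
```

A True result only shows that something is listening on the port. It does not confirm that the embedding or chat model has been pulled, so the verification command above is still the more complete test.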