RUNLOCALAIv38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.coIndependently operated
Vector Stores and Embeddings

05. Adding Documents

Chapter 5 of 18 · 15 min
KEY INSIGHT

ChromaDB accepts raw text and generates embeddings automatically when the collection has an embedding function. Alternatively, you can pass pre-computed embeddings yourself. The automatic approach is simpler; the manual approach offers more control.

### Automatic Embedding

```python
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="./chroma_db")

# Wrap the model in a ChromaDB embedding function. A raw SentenceTransformer
# object does not implement the interface ChromaDB expects.
embedding_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)

# Create the collection with the embedding function attached
collection = client.get_or_create_collection(
    name="docs",
    embedding_function=embedding_fn,
)

# Add documents; ids, documents, and metadatas are parallel lists
collection.add(
    documents=[
        "How to reset a forgotten password",
        "Password reset not working after email change",
        "Contact customer support for account recovery",
        "Set up two-factor authentication",
    ],
    ids=["doc1", "doc2", "doc3", "doc4"],
    metadatas=[
        {"category": "auth", "priority": "high"},
        {"category": "auth", "priority": "high"},
        {"category": "support", "priority": "medium"},
        {"category": "security", "priority": "medium"},
    ],
)

print(f"Collection count: {collection.count()}")
```

### Manual Embedding

Use manual embedding when you want to reuse embeddings across systems or use a different embedding model:

```python
import chromadb
from sentence_transformers import SentenceTransformer

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection(name="docs_manual")

# Pre-compute embeddings; encode() returns a numpy array
model = SentenceTransformer("all-MiniLM-L6-v2")
docs = ["First document text", "Second document text"]
embeddings = model.encode(docs)

collection.add(
    documents=docs,
    embeddings=embeddings.tolist(),  # plain list of lists, not a numpy array
    ids=["manual1", "manual2"],
)
```

Common failure: passing numpy arrays directly instead of converting them to lists. Some ChromaDB versions reject numpy arrays, while a plain Python list of lists works regardless of version.

```python
# This can fail, depending on the ChromaDB version:
collection.add(ids=["manual1", "manual2"], embeddings=embeddings)  # numpy array

# This works:
collection.add(ids=["manual1", "manual2"], embeddings=embeddings.tolist())  # list of lists
```
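
The list conversion described above can be wrapped in a small helper so every caller passes the same shape. This is a minimal sketch, not part of ChromaDB: `to_embedding_lists` is a hypothetical name, and it assumes array-like inputs expose `.tolist()`, as numpy arrays do.

```python
from typing import Any, List


def to_embedding_lists(embeddings: Any) -> List[List[float]]:
    """Return embeddings as a plain list of lists of floats.

    Hypothetical helper for illustration; not a ChromaDB API.
    """
    # Array-like objects (for example numpy arrays) expose .tolist().
    if hasattr(embeddings, "tolist"):
        embeddings = embeddings.tolist()

    # Copy each row into a list of Python floats.
    rows = [[float(value) for value in row] for row in embeddings]
    if not rows:
        raise ValueError("no embeddings given")

    # Every vector in one batch should have the same dimension.
    dim = len(rows[0])
    if any(len(row) != dim for row in rows):
        raise ValueError("embeddings have inconsistent dimensions")
    return rows
```

With this in place, the manual example could call `collection.add(documents=docs, embeddings=to_embedding_lists(embeddings), ids=[...])` whether `embeddings` started as an array or a list.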

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
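
One way to capture the details this checkpoint asks for is a short script that writes them out as JSON. The sketch below uses only the standard library. The function names, the list of packages, and the sample values passed in at the bottom are assumptions for illustration, so substitute what you actually ran and observed.

```python
import json
import platform
from importlib import metadata
from pathlib import Path


def package_version(name: str) -> str:
    """Return the installed version of a package, or a marker if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def checkpoint_record(data_path: str, observed_output: str, notes: str = "") -> dict:
    """Collect the runtime details needed to reproduce a run (hypothetical helper)."""
    # The package list is an assumption; adjust it to what your example imports.
    packages = ("chromadb", "sentence-transformers", "numpy")
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": {name: package_version(name) for name in packages},
        "data_path": str(Path(data_path).resolve()),
        "observed_output": observed_output,
        "notes": notes,  # e.g. model size, vector count, CPU/GPU backend, memory
    }


if __name__ == "__main__":
    # Placeholder values; paste the output you actually observed.
    record = checkpoint_record("./chroma_db", "<paste observed output here>", notes="CPU only")
    print(json.dumps(record, indent=2))
```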

EXERCISE

Add 10 documents to a new collection. Include metadata for each document. Verify the count matches what you added.
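
If you want a starting point, the sketch below shows one possible way to prepare the parallel lists and check the count afterwards. `build_batch` and `add_and_verify` are hypothetical helpers, the row layout (`id`, `text`, `metadata`) is an assumption, and `collection` is expected to be a ChromaDB collection like the ones created earlier in this chapter.

```python
from typing import Any, Dict, List


def build_batch(rows: List[Dict[str, Any]]) -> Dict[str, list]:
    """Split rows of {"id", "text", "metadata"} into parallel lists for collection.add()."""
    ids = [row["id"] for row in rows]
    if len(set(ids)) != len(ids):
        raise ValueError("ids must be unique within a batch")

    for row in rows:
        if not row["metadata"]:
            raise ValueError(f"{row['id']}: the exercise asks for metadata on every document")
        for key, value in row["metadata"].items():
            # Keep metadata values to simple scalar types.
            if not isinstance(value, (str, int, float, bool)):
                raise TypeError(f"{row['id']}: metadata '{key}' must be str, int, float, or bool")

    return {
        "ids": ids,
        "documents": [row["text"] for row in rows],
        "metadatas": [row["metadata"] for row in rows],
    }


def add_and_verify(collection: Any, rows: List[Dict[str, Any]]) -> int:
    """Add rows and confirm the collection grew by exactly len(rows)."""
    before = collection.count()
    collection.add(**build_batch(rows))
    added = collection.count() - before
    if added != len(rows):
        # A mismatch often means some ids were already present in the collection.
        raise RuntimeError(f"expected {len(rows)} new documents, found {added}")
    return added
```

Comparing the count before and after, rather than checking the total alone, keeps the check meaningful if the collection already held documents from an earlier run.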
