Vector Databases
A vector database stores embedding vectors and enables fast similarity search across millions of entries. The right choice depends on your scale, deployment constraints, filtering requirements, and whether you prefer a managed service or self-hosted infrastructure.
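To make "similarity search" concrete, here is a minimal sketch of the brute-force version: score every stored vector against the query with cosine similarity and keep the best matches. The function names and the toy 3-dimensional vectors are illustrative assumptions, not part of any library. Vector databases typically avoid this full scan by using approximate nearest-neighbor indexes, which is what keeps queries fast at millions of entries.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)

def brute_force_search(query, vectors, top_k=2):
    """Score every stored vector against the query and return the best ids."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:top_k]]

# Toy 3-dimensional "embeddings" (illustrative values only).
store = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.0],
    "returns-faq": [0.8, 0.2, 0.1],
}
top = brute_force_search([1.0, 0.0, 0.0], store, top_k=2)
print(top)
```

The scan costs one similarity computation per stored vector, so it is fine for a few thousand chunks and a useful baseline for checking the results a real index returns.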
Pinecone — Managed, Production-Scale
Pinecone is a fully managed vector database designed for production workloads. It offers a serverless tier (pay per query) and pod-based deployments for predictable workloads.
- Strengths: Zero infrastructure management; scales automatically; fast queries even at hundreds of millions of vectors; strong metadata filtering; good SDK support
- Weaknesses: Vendor lock-in; data leaves your infrastructure (relevant for compliance); cost can be high at very large scale
- Best for: Teams that want to ship quickly and not manage vector infrastructure; production applications with millions of documents
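- Pricing: Serverless free tier available; paid plans are usage-based (billed by read units, write units, and storage), and rates change, so check Pinecone's current pricing page before budgeting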
from pinecone import Pinecone
pc = Pinecone(api_key="your-api-key")
index = pc.Index("my-rag-index")

# Upsert vectors (embedding is a list of floats produced by your embedding model)
index.upsert(vectors=[
    {"id": "doc-1-chunk-0", "values": embedding, "metadata": {"source": "policy.pdf", "text": "..."}}
])

# Query with an embedding of the user's question from the same model
results = index.query(vector=query_embedding, top_k=5, include_metadata=True)

Weaviate — Open-Source + Multi-Modal
Weaviate is an open-source vector database with a managed cloud offering. It natively supports multi-modal data (text, images) and has strong hybrid search (vector + BM25) built in.
- Strengths: Open-source (self-host for free); native hybrid search; multi-modal support; GraphQL query interface; strong filtering
- Weaknesses: More complex to configure than Pinecone; GraphQL adds learning curve; self-hosted requires infrastructure management
- Best for: Teams that want open-source flexibility; use cases requiring hybrid search out of the box; multi-modal applications
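Hybrid search means merging a vector-similarity ranking with a keyword (BM25) ranking into one result list. The sketch below uses reciprocal rank fusion, a common generic way to combine ranked lists. It is an illustration of the idea under that assumption, not a description of how Weaviate fuses scores internally, and the chunk ids are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked id lists; ids ranked high in any list score well."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); k dampens the effect of top ranks.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results from a vector query and a keyword query.
vector_hits = ["chunk-a", "chunk-b", "chunk-c"]
keyword_hits = ["chunk-c", "chunk-a", "chunk-d"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Chunks that appear in both lists rise to the top, which is the practical benefit of hybrid search: exact keyword matches and semantically similar passages reinforce each other.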
Chroma — Lightweight, Local Development
Chroma is an open-source embedding database focused on developer experience. It runs in-memory or persists to a local directory — no separate server process needed for development.
- Strengths: Dead simple to get started; no server required (embedded mode); good Python SDK; integrates well with LangChain
- Weaknesses: Not designed for large-scale production; limited horizontal scaling; fewer advanced filtering options than Pinecone/Weaviate
- Best for: Prototyping, local development, small knowledge bases (<1M documents), learning RAG concepts
import chromadb
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("my_docs")

# Add documents with precomputed embeddings (lists of floats from your embedding model)
collection.add(
    documents=["chunk text here", "another chunk"],
    embeddings=[embedding1, embedding2],
    metadatas=[{"source": "doc1.pdf"}, {"source": "doc2.pdf"}],
    ids=["chunk-1", "chunk-2"]
)

# Query
results = collection.query(query_embeddings=[query_embedding], n_results=5)

pgvector — Vector Search in Postgres
pgvector is a Postgres extension that adds a vector data type and similarity search operators. If you already use Postgres, you can add vector search without running a separate service.
- Strengths: No new infrastructure if you already use Postgres; combine vector search with relational queries in a single SQL statement; strong metadata filtering via SQL; ACID transactions across vector and relational data
- Weaknesses: Scales less well than dedicated vector databases at very high vector counts (>10M); there is no official managed pgvector service, although hosted Postgres providers such as Supabase and Neon include the extension
- Best for: Teams already on Postgres; applications needing relational + vector queries together; self-hosted deployments under ~5M vectors
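As a rough sketch of what "relational + vector queries in a single SQL statement" can look like, the snippet below builds a filtered similarity query. The `chunks` table, its columns, and the 3-dimensional vector size are hypothetical; `<=>` is pgvector's cosine-distance operator, and the `%(name)s` placeholders assume a driver with psycopg-style named parameters.

```python
def to_pgvector_literal(values):
    """Format a Python list as pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"

# Hypothetical schema: table and column names are illustrative only.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS chunks (
    id bigserial PRIMARY KEY,
    source text,
    body text,
    embedding vector(3)
);
"""

# Relational filter (WHERE) and vector ranking (ORDER BY) in one statement.
# <=> returns cosine distance, so smaller values mean more similar rows.
SEARCH_SQL = """
SELECT id, source, body
FROM chunks
WHERE source = %(source)s
ORDER BY embedding <=> %(query)s::vector
LIMIT 5;
"""

params = {"source": "policy.pdf", "query": to_pgvector_literal([0.1, 0.2, 0.3])}
print(params["query"])
# With a Postgres driver such as psycopg, this would run as:
#   cursor.execute(SEARCH_SQL, params)
```

Passing the vector as a text literal and casting it with `::vector` keeps the example free of driver-specific adapters; real embeddings would use the dimension of your model rather than 3.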
Selection Criteria Summary
Choose based on scale
- Under 100k vectors: Chroma (local) or pgvector
- 100k–10M vectors: any option; Pinecone or Weaviate managed for simplicity
- Over 10M vectors: Pinecone (managed) or Weaviate/Qdrant (self-hosted); Qdrant is another open-source vector database, not profiled above
Choose based on constraints
- Data must not leave your infra: pgvector, Chroma, or self-hosted Weaviate/Qdrant
- Existing Postgres: pgvector first
- Fastest to ship: Pinecone (managed, no ops)
- Hybrid search (vector + keyword): Weaviate or Qdrant
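The rules above can be read as a small decision procedure. The function below is one possible encoding of them, offered as a sketch: the parameter names, the order in which constraints are applied, and the ordering of the returned list are assumptions made for illustration, not part of the guidance itself.

```python
def recommend(num_vectors, on_prem_only=False, has_postgres=False, needs_hybrid=False):
    """Return candidate vector stores, following the scale and constraint rules."""
    # Scale rule: start from the options suggested for this collection size.
    if num_vectors < 100_000:
        options = ["Chroma", "pgvector"]
    elif num_vectors <= 10_000_000:
        options = ["Pinecone", "Weaviate", "Chroma", "pgvector"]
    else:
        options = ["Pinecone", "Weaviate", "Qdrant"]

    # Data must not leave your infrastructure: drop the managed-only service.
    if on_prem_only:
        options = [name for name in options if name != "Pinecone"]

    # Hybrid search: keep only the stores listed for vector + keyword search.
    if needs_hybrid:
        hybrid = [name for name in options if name in ("Weaviate", "Qdrant")]
        options = hybrid or ["Weaviate", "Qdrant"]

    # Existing Postgres: try pgvector first when it is still a candidate.
    if has_postgres and "pgvector" in options:
        options.remove("pgvector")
        options.insert(0, "pgvector")
    return options

print(recommend(50_000, has_postgres=True))
print(recommend(50_000_000, on_prem_only=True))
```

Treat the output as a shortlist to evaluate rather than a final answer; cost, team experience, and filtering needs still have to be weighed by hand.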
Checklist: Do You Understand This?
- Pinecone: managed, serverless, fastest to ship — best for production scale without ops overhead
- Weaviate: open-source, native hybrid search, multi-modal — best for complex filtering and self-hosted teams
- Chroma: embedded, no server required — best for prototyping and local development
- pgvector: Postgres extension — best if you already run Postgres and want relational + vector queries together
- Scale rule of thumb: Chroma for prototypes and small collections (<1M documents); pgvector up to a few million vectors; Pinecone/Weaviate for larger production workloads