Intermediate

Vector Databases

A vector database stores embedding vectors and enables fast similarity search across millions of entries. The right choice depends on your scale, deployment constraints, filtering requirements, and whether you prefer a managed service or self-hosted infrastructure.
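To make "similarity search" concrete, here is a minimal sketch of what a vector database does conceptually: score every stored vector against the query and return the closest ones. The document names and the 3-dimensional vectors are made up for illustration. Real systems avoid this linear scan by using approximate nearest-neighbor indexes, which is what keeps queries fast at millions of entries.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), vec_id) for vec_id, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [vec_id for _, vec_id in scored[:k]]

# Toy "embeddings" (hypothetical values, not produced by a real model)
store = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.0],
    "returns-faq": [0.8, 0.2, 0.1],
}

nearest = top_k([1.0, 0.0, 0.0], store, k=2)
print(nearest)
```

The brute-force scan costs one comparison per stored vector, which is fine for a few thousand entries and is the part a dedicated index replaces.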

Pinecone — Managed, Production-Scale

Pinecone is a fully managed vector database designed for production workloads. It offers a serverless tier (pay per query) and pod-based deployments for predictable workloads.

Strengths: Zero infrastructure management; scales automatically; fast queries even at hundreds of millions of vectors; strong metadata filtering; good SDK support
Weaknesses: Vendor lock-in; data leaves your infrastructure (relevant for compliance); cost can be high at very large scale
Best for: Teams that want to ship quickly and not manage vector infrastructure; production applications with millions of documents
Pricing: Serverless free tier available; paid plans are usage-based (quoted here as $0.04 per 1M read units, but rates change, so confirm against the current pricing page)

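from pinecone import Pinecone

# Assumes the index "my-rag-index" already exists and that its dimension
# matches the output size of your embedding model
pc = Pinecone(api_key="your-api-key")
index = pc.Index("my-rag-index")

# Upsert vectors; `embedding` is a list of floats from your embedding model
index.upsert(vectors=[
    {"id": "doc-1-chunk-0", "values": embedding, "metadata": {"source": "policy.pdf", "text": "..."}}
])

# Query; `query_embedding` must come from the same model as the stored vectors
results = index.query(vector=query_embedding, top_k=5, include_metadata=True)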

Weaviate — Open-Source + Multi-Modal

Weaviate is an open-source vector database with a managed cloud offering. It natively supports multi-modal data (text, images) and has strong hybrid search (vector + BM25) built in.

Strengths: Open-source (self-host for free); native hybrid search; multi-modal support; GraphQL query interface; strong filtering
Weaknesses: More complex to configure than Pinecone; GraphQL adds a learning curve; self-hosting requires infrastructure management
Best for: Teams that want open-source flexibility; use cases requiring hybrid search out of the box; multi-modal applications
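Hybrid search blends a vector-similarity ranking with a keyword (BM25) ranking. The sketch below shows one common way to do that: normalize both score sets and mix them with a weight `alpha`. It illustrates the idea only and is not Weaviate's actual implementation; the function names, document ids, and scores are hypothetical.

```python
def normalize(scores):
    """Min-max scale a {doc_id: score} dict into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def hybrid_rank(vector_scores, keyword_scores, alpha=0.5):
    """Blend two score sets; alpha=1.0 is pure vector, alpha=0.0 is pure keyword."""
    v = normalize(vector_scores)
    kw = normalize(keyword_scores)
    doc_ids = set(v) | set(kw)
    # A document missing from one result set contributes 0 for that side
    combined = {d: alpha * v.get(d, 0.0) + (1 - alpha) * kw.get(d, 0.0) for d in doc_ids}
    # Highest combined score first; ties broken by id for a stable order
    return sorted(combined, key=lambda d: (-combined[d], d))

vector_scores = {"a": 0.92, "b": 0.85, "c": 0.40}   # e.g. cosine similarity
keyword_scores = {"b": 7.1, "c": 6.5, "d": 1.2}     # e.g. BM25 scores

ranking = hybrid_rank(vector_scores, keyword_scores, alpha=0.5)
print(ranking)
```

Document "b" ranks first here because it scores well on both signals, even though "a" is the best pure vector match. With a database that has hybrid search built in, this fusion happens server-side in a single query.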

Chroma — Lightweight, Local Development

Chroma is an open-source embedding database focused on developer experience. It runs in-memory or persists to a local directory — no separate server process needed for development.

Strengths: Dead simple to get started; no server required (embedded mode); good Python SDK; integrates well with LangChain
Weaknesses: Not designed for large-scale production; limited horizontal scaling; fewer advanced filtering options than Pinecone/Weaviate
Best for: Prototyping, local development, small knowledge bases (<1M documents), learning RAG concepts

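import chromadb

# Persists to a local directory; no separate server process is needed
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("my_docs")

# Add documents; `embedding1` and `embedding2` are lists of floats from your
# embedding model (if `embeddings` is omitted, Chroma computes them with its
# default embedding function)
collection.add(
    documents=["chunk text here", "another chunk"],
    embeddings=[embedding1, embedding2],
    metadatas=[{"source": "doc1.pdf"}, {"source": "doc2.pdf"}],
    ids=["chunk-1", "chunk-2"]
)

# Query; `query_embedding` must come from the same model as the stored vectors
results = collection.query(query_embeddings=[query_embedding], n_results=5)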

pgvector — Vector Search in Postgres

pgvector is a Postgres extension that adds a vector data type and similarity search operators. If you already use Postgres, you can add vector search without running a separate service.

Strengths: No new infrastructure if you already use Postgres; combine vector search with relational queries in a single SQL statement; strong metadata filtering via SQL; ACID transactions across vector and relational data
Weaknesses: Query performance scales less well than dedicated vector databases at very high vector counts (>10M); the Postgres project itself offers no managed vector service, though providers such as Supabase and Neon offer hosted pgvector
Best for: Teams already on Postgres; applications needing relational + vector queries together; self-hosted deployments under ~5M vectors
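As a sketch of combining a relational filter with vector ordering in a single SQL statement, the snippet below defines a schema and a search query and runs them through any DB-API connection (for example one opened with psycopg). The `chunks` table, its columns, and the helper names are hypothetical, and the 3-dimensional column is only for illustration. `<=>` is pgvector's cosine-distance operator.

```python
# Schema for a hypothetical "chunks" table; run once against your database.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS chunks (
    id bigserial PRIMARY KEY,
    source text NOT NULL,
    content text NOT NULL,
    embedding vector(3)  -- dimension must match your embedding model
);
"""

# Relational filter (WHERE) and vector ordering (ORDER BY) in one statement.
# Smaller cosine distance means more similar, so ascending order is correct.
SEARCH_SQL = """
SELECT id, content
FROM chunks
WHERE source = %s
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""

def to_vector_literal(values):
    """Format a list of numbers as pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"

def search_chunks(conn, query_embedding, source, k=5):
    """Run the search on an open DB-API connection (assumed, e.g. psycopg)."""
    with conn.cursor() as cur:
        cur.execute(SEARCH_SQL, (source, to_vector_literal(query_embedding), k))
        return cur.fetchall()

literal = to_vector_literal([0.1, 0.2, 0.3])
print(literal)
```

Passing the vector as a text literal cast to `vector` avoids needing a driver-specific adapter; the pgvector Python package also provides adapters if you prefer to pass lists or arrays directly.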

Selection Criteria Summary

Choose based on scale

Under 100k vectors: Chroma (local) or pgvector
100k–10M vectors: any option; Pinecone or Weaviate managed for simplicity
Over 10M vectors: Pinecone (managed) or self-hosted Weaviate or Qdrant (Qdrant is another open-source vector database, not covered above)

Choose based on constraints

Data must not leave your infra: pgvector, Chroma, or self-hosted Weaviate/Qdrant
Existing Postgres: pgvector first
Fastest to ship: Pinecone (managed, no ops)
Hybrid search (vector + keyword): Weaviate or Qdrant
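The rules of thumb above can be sketched as a small helper. The function name, its parameters, and the order in which constraints are checked are assumptions made for illustration; treat the output as a starting shortlist rather than a definitive recommendation.

```python
def suggest_vector_db(num_vectors, data_must_stay_in_house=False,
                      has_postgres=False, needs_hybrid_search=False):
    """Return candidate databases based on the constraint and scale rules."""
    # Hard constraints are checked before scale (an assumed priority order)
    if needs_hybrid_search:
        return ["Weaviate", "Qdrant"]
    if has_postgres and num_vectors <= 10_000_000:
        return ["pgvector"]
    if data_must_stay_in_house:
        if num_vectors > 10_000_000:
            return ["Weaviate (self-hosted)", "Qdrant (self-hosted)"]
        return ["pgvector", "Chroma", "Weaviate (self-hosted)", "Qdrant (self-hosted)"]

    # No hard constraint applies: fall back to the scale tiers
    if num_vectors < 100_000:
        return ["Chroma", "pgvector"]
    if num_vectors <= 10_000_000:
        return ["Pinecone", "Weaviate (managed)"]
    return ["Pinecone", "Weaviate (self-hosted)", "Qdrant (self-hosted)"]

print(suggest_vector_db(50_000))
print(suggest_vector_db(2_000_000, has_postgres=True))
print(suggest_vector_db(50_000_000))
```

Checking constraints before scale reflects the idea that a compliance or infrastructure requirement usually rules options out entirely, while scale only ranks the options that remain.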

Checklist: Do You Understand This?

Pinecone: managed, serverless, fastest to ship — best for production scale without ops overhead
Weaviate: open-source, native hybrid search, multi-modal — best for complex filtering and self-hosted teams
Chroma: embedded, no server required — best for prototyping and local development
pgvector: Postgres extension — best if you already run Postgres and want relational + vector queries together
Scale rule of thumb: Chroma or pgvector for smaller collections (Chroma under ~1M documents, pgvector under ~5M vectors); Pinecone or Weaviate for larger production scale