Intermediate

Graph RAG vs Vector RAG

Vector RAG finds similar text. Graph RAG follows connections. For simple lookups, vector search is faster, cheaper, and sufficient. For questions that require multi-hop reasoning, global synthesis, or relationship-based answers, vector search falls well short and in some cases fails entirely. Knowing which to use (and when to combine them) is a core architectural decision.

The RAG Complexity Spectrum

Vector RAG: similarity search. Fast, cheap, good for simple lookups.
Graph RAG: relationship traversal. Powerful for complex reasoning.

Example queries along the spectrum, from simple lookups to complex reasoning:

Find docs about topic X
Summarise recent updates
Who worked on projects using library Y?
What are the main themes across 10,000 docs?

How They Differ

Vector RAG

Embeds documents as vectors; retrieves by cosine similarity
Finds text that looks like the query — lexical and semantic proximity
Each chunk is independent — no explicit relationships between items
Simple to set up; fast at query time; low indexing cost
Degrades on multi-hop: “who is the CEO's manager's direct report?” requires two hops that a single vector query can't traverse
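
As a minimal sketch of the retrieval step described above, the snippet below ranks chunks by cosine similarity to a query vector. The three-dimensional vectors and chunk names are made up for illustration; a real system would get its vectors from an embedding model and use a vector index rather than a linear scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunks, k=2):
    """Return the ids of the k chunks most similar to the query.

    chunks is a list of (chunk_id, vector) pairs. Each chunk is scored
    on its own; nothing links one chunk to another.
    """
    scored = [(cosine_similarity(query_vec, vec), cid) for cid, vec in chunks]
    scored.sort(reverse=True)
    return [cid for _, cid in scored[:k]]

# Hypothetical toy "embeddings" (assumption, not real model output).
chunks = [
    ("auth-guide", [0.9, 0.1, 0.0]),
    ("billing-faq", [0.1, 0.9, 0.0]),
    ("login-troubleshooting", [0.8, 0.2, 0.1]),
]
query = [1.0, 0.0, 0.0]
print(top_k(query, chunks, k=2))  # ['auth-guide', 'login-troubleshooting']
```

Because every chunk is scored independently, there is no step in this loop where the result of one lookup can feed the next, which is why multi-hop questions need something more.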

Graph RAG

Stores entities and relationships in a graph; retrieves by traversal
Finds answers by following connections — relationship chains and patterns
Relationships are explicit — typed edges with direction and properties
Higher indexing cost (LLM extraction pipeline required)
Multi-hop queries are native: any depth, any relationship pattern
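
To make the traversal idea concrete, here is a small in-memory sketch of typed, directed edges and a two-hop query ("who worked on projects using library Y?"). The Graph class, relation names, and entities are hypothetical; a production system would typically use a graph database and its query language instead.

```python
from collections import defaultdict

class Graph:
    """Tiny in-memory graph with typed, directed edges (illustrative only)."""

    def __init__(self):
        self._out = defaultdict(list)  # source -> [(relation, target), ...]

    def add_edge(self, source, relation, target):
        self._out[source].append((relation, target))

    def follow(self, start, relations):
        """Follow a chain of relation types from start, one hop per relation.

        Returns the set of nodes reached after the last hop.
        """
        frontier = {start}
        for relation in relations:
            frontier = {
                target
                for node in frontier
                for rel, target in self._out[node]
                if rel == relation
            }
        return frontier

g = Graph()
g.add_edge("lib-y", "USED_BY", "project-a")
g.add_edge("lib-y", "USED_BY", "project-b")
g.add_edge("project-a", "HAS_CONTRIBUTOR", "alice")
g.add_edge("project-b", "HAS_CONTRIBUTOR", "bob")
g.add_edge("project-c", "HAS_CONTRIBUTOR", "carol")  # does not use lib-y

# Two hops: library -> projects -> people.
print(sorted(g.follow("lib-y", ["USED_BY", "HAS_CONTRIBUTOR"])))  # ['alice', 'bob']
```

Adding a third or fourth hop only means passing a longer list of relation types; the loop itself does not change.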

Performance Comparison

The performance gap between the two approaches is not marginal on complex tasks:

Task Type	Vector RAG	Graph RAG
Simple fact retrieval	Good — fast, sufficient	Overkill — adds latency for no gain
Semantic search (“find docs about X”)	Excellent	Can hurt precision (adds tangential context)
Multi-hop reasoning (2+ hops)	32% accuracy	86% accuracy
Queries involving 10+ entities	Accuracy → 0%	Above 70%
Global synthesis (“main themes across corpus”)	Cannot do this	Native (community summaries)
Schema-bound aggregation queries	0% accuracy	90% accuracy

Accuracy figures from Microsoft Research GraphRAG benchmarks (2024–2025). Graph RAG wins 70–80% of complex sensemaking tasks in head-to-head evaluations.

Microsoft GraphRAG Architecture

Microsoft Research released GraphRAG in 2024 to address the limitations of naive vector RAG on large corpora. Its key innovation is hierarchical community detection:

Extract — LLM extracts all entities and relationships from the corpus (expensive, done at indexing time)
Cluster — Leiden algorithm groups related entities into hierarchical communities
Summarise — LLM generates a summary for each community at multiple levels of the hierarchy
Query — Local search finds specific entities; global search traverses community summaries

The community summaries make it possible to answer questions that a vector index alone cannot, such as “What are the major recurring themes across all 50,000 support tickets?” The answer isn't in any single document; it's in the aggregated structure.
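
The four stages can be sketched end to end in a few lines. This is not Microsoft's implementation: the triples are hard-coded where an LLM would extract them, connected components stand in for Leiden clustering, a string join stands in for the LLM summary, and only a single level of the hierarchy is built. All names are hypothetical.

```python
from collections import defaultdict

# Step 1 (Extract): in a real pipeline an LLM produces these triples.
triples = [
    ("login", "FAILS_WITH", "timeout"),
    ("timeout", "CAUSED_BY", "auth-service"),
    ("invoice", "MISSING", "tax-line"),
    ("tax-line", "CONFIGURED_IN", "billing-settings"),
]

def build_adjacency(triples):
    """Undirected adjacency used only for clustering."""
    adj = defaultdict(set)
    for source, _, target in triples:
        adj[source].add(target)
        adj[target].add(source)
    return adj

def cluster(adj):
    """Step 2 stand-in: connected components, NOT the Leiden algorithm."""
    seen, communities = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(adj[node] - members)
        seen |= members
        communities.append(members)
    return communities

def summarise(community, triples):
    """Step 3 stand-in: join the community's facts; a real system calls an LLM."""
    facts = [f"{s} {r} {t}" for s, r, t in triples if s in community]
    return "; ".join(facts)

def global_search(summaries):
    """Step 4 (global): every community summary becomes context for the answer."""
    return list(summaries)

communities = cluster(build_adjacency(triples))
summaries = [summarise(c, triples) for c in communities]
for summary in global_search(summaries):
    print(summary)
```

The point of the sketch is the shape of the data: a corpus-wide question is answered from the per-community summaries, not from any individual document.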

Cost Tradeoffs

Full GraphRAG indexing cost

The original Microsoft GraphRAG approach generates LLM summaries for every community at indexing time. For a typical enterprise corpus this costs $20–$500 in LLM API calls, and the indexing must be repeated whenever new data is added.

LazyGraphRAG (June 2025)

Microsoft's LazyGraphRAG variant defers community summarisation to query time, reducing indexing cost to under $5. The tradeoff: 2–8 additional seconds per query as summaries are generated on demand.
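
A rough back-of-envelope comparison, using only the figures quoted above. The rebuild count and query volume are hypothetical assumptions chosen for illustration, and the sketch assumes every data update triggers a full rebuild.

```python
def total_indexing_cost(cost_per_run_usd, runs):
    """Indexing spend when the whole index is rebuilt `runs` times."""
    return cost_per_run_usd * runs

def total_added_latency(seconds_per_query, queries):
    """Extra waiting time accumulated across all queries."""
    return seconds_per_query * queries

RUNS = 12        # hypothetical: one rebuild a month for a year
QUERIES = 1_000  # hypothetical query volume over the same period

full_low = total_indexing_cost(20, RUNS)
full_high = total_indexing_cost(500, RUNS)
lazy_ceiling = total_indexing_cost(5, RUNS)
wait_low = total_added_latency(2, QUERIES)
wait_high = total_added_latency(8, QUERIES)

print(f"Full GraphRAG indexing: ${full_low}-${full_high}")
print(f"LazyGraphRAG indexing: under ${lazy_ceiling}")
print(f"LazyGraphRAG added wait: {wait_low}-{wait_high} s across {QUERIES} queries")
```

Under these assumptions the choice is between paying more up front and paying in query latency; which matters more depends on how often the corpus changes and how many queries it serves.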

The Hybrid Pattern

The practical architecture for most production systems is neither pure vector RAG nor pure graph RAG — it's both, used in sequence:

1. Vector search

Find the most relevant starting nodes. Fast semantic similarity narrows a corpus of millions to tens of relevant entities.

2. Graph traversal

From the starting nodes, follow edges to collect related context — dependencies, relationships, co-references — that pure similarity search would miss.

3. LLM generation

Pass the combined context (vector results + graph traversal results) to the LLM. The model gets both semantic similarity and structured relationship context.

LlamaIndex's PropertyGraphIndex is a widely used implementation: it can store vector embeddings alongside graph triples, enabling this hybrid retrieval pattern from a single index.
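
The three steps can also be sketched without any particular library. The code below is not LlamaIndex's API; the function names, toy vectors, and edges are assumptions for illustration, and the final LLM call is left out so that the sketch ends at the assembled prompt.

```python
import math
from collections import defaultdict

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def vector_search(query_vec, entity_vecs, k):
    """Step 1: pick the k entities whose vectors are closest to the query."""
    ranked = sorted(
        entity_vecs,
        key=lambda name: cosine(query_vec, entity_vecs[name]),
        reverse=True,
    )
    return ranked[:k]

def expand(seeds, edges, hops):
    """Step 2: collect triples reachable within `hops` edges of the seeds."""
    out = defaultdict(list)
    for s, r, t in edges:
        out[s].append((r, t))
    frontier, visited, found = set(seeds), set(seeds), []
    for _ in range(hops):
        next_frontier = set()
        for node in sorted(frontier):
            for r, t in out[node]:
                found.append((node, r, t))
                if t not in visited:
                    visited.add(t)
                    next_frontier.add(t)
        frontier = next_frontier
    return found

def build_prompt(question, seeds, triples):
    """Step 3 input: combine similarity hits and graph facts into one context."""
    lines = [f"Question: {question}", "Similar entities: " + ", ".join(seeds), "Related facts:"]
    lines += [f"- {s} {r} {t}" for s, r, t in triples]
    return "\n".join(lines)

# Hypothetical toy data.
entity_vecs = {
    "payments-service": [0.9, 0.1],
    "search-service": [0.1, 0.9],
    "ledger-db": [0.6, 0.4],
}
edges = [
    ("payments-service", "DEPENDS_ON", "ledger-db"),
    ("ledger-db", "OWNED_BY", "data-team"),
    ("search-service", "DEPENDS_ON", "index-cluster"),
]

seeds = vector_search([1.0, 0.0], entity_vecs, k=1)
facts = expand(seeds, edges, hops=2)
prompt = build_prompt("Who owns what payments depends on?", seeds, facts)
print(prompt)
```

Note that "data-team" never appears in the similarity results; it reaches the prompt only because the traversal followed two edges out from the seed entity.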

Decision Guide: Which to Use

Use vector RAG when

Questions are about specific topics or documents (“find me the docs about authentication”)
You need fast, cheap, scalable retrieval
Your data doesn't have strong relational structure
You're prototyping or the task is simple enough that vector similarity is sufficient

Use graph RAG when

Questions require multi-hop reasoning (“what X is connected to Y via Z?”)
You need global synthesis across the entire corpus
Relationships between entities are part of the answer, not incidental
Your domain has strong relational structure: healthcare pathways, fraud networks, codebases, org charts
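
The guide above can be written down as a simple rule of thumb. This is only an encoding of the two lists as stated, with hypothetical parameter names; it is not a substitute for evaluating both approaches on real queries.

```python
def choose_retriever(*, multi_hop=False, global_synthesis=False,
                     relationships_in_answer=False, relational_domain=False):
    """Return ("graph" or "vector", reasons) based on the criteria in the guide."""
    criteria = {
        "multi-hop reasoning": multi_hop,
        "global synthesis": global_synthesis,
        "relationships are part of the answer": relationships_in_answer,
        "strong relational structure": relational_domain,
    }
    reasons = [name for name, holds in criteria.items() if holds]
    return ("graph" if reasons else "vector", reasons)

# "Find me the docs about authentication": no graph criterion applies.
print(choose_retriever())

# An org-chart question: several graph criteria apply.
print(choose_retriever(multi_hop=True, relationships_in_answer=True, relational_domain=True))
```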

Checklist: Do You Understand This?

Vector RAG finds similar text; Graph RAG follows typed relationship edges between entities
Graph RAG: 86% accuracy on multi-hop tasks vs 32% for vector RAG; vector accuracy drops to 0% on 10+ entity queries
Graph RAG hurts on simple semantic search — vector is better there
Microsoft GraphRAG: Leiden clustering + LLM community summaries = global synthesis capability
LazyGraphRAG (June 2025) reduces indexing cost from $20–$500 to under $5 by deferring summaries to query time, at the price of 2–8 additional seconds per query
The hybrid pattern (vector search → graph traversal → LLM generation) is best for most production systems