Graph RAG vs Vector RAG
Vector RAG finds similar text. Graph RAG follows connections. For simple lookups, vector search is faster, cheaper, and sufficient. For questions that require multi-hop reasoning, global synthesis, or relationship-based answers, vector search can't compete, and in some cases it fails entirely. Knowing which to use (and when to combine them) is a core architectural decision.
The RAG Complexity Spectrum
How They Differ
Vector RAG
- Embeds documents as vectors; retrieves by cosine similarity
- Finds text that looks like the query: lexical and semantic proximity
- Each chunk is independent: no explicit relationships between items
- Simple to set up; fast at query time; low indexing cost
- Degrades on multi-hop: "who is the CEO's manager's direct report?" requires two hops that a single vector query can't traverse
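As a minimal sketch of the retrieval step described above, the following ranks chunks by cosine similarity to a query vector. The three-dimensional vectors and the `retrieve` helper are hypothetical stand-ins; a real system would obtain embeddings from an embedding model and usually query a vector database rather than a dictionary.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (assumption: made-up values for illustration only).
chunks = {
    "Alice is the CEO.": [0.9, 0.1, 0.0],
    "Bob manages the sales team.": [0.6, 0.5, 0.1],
    "The cafeteria opens at 8am.": [0.0, 0.1, 0.9],
}

def retrieve(query_vector, k=2):
    """Score every chunk independently against the query and return the top k."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in chunks.items()
    ]
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]

top = retrieve([1.0, 0.2, 0.0])
print(top)
```

Each chunk is scored on its own, so nothing in this loop can chain one fact to the next. That is the limitation the multi-hop bullet describes.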
Graph RAG
- Stores entities and relationships in a graph; retrieves by traversal
- Finds answers by following connections: relationship chains and patterns
- Relationships are explicit: typed edges with direction and properties
- Higher indexing cost (LLM extraction pipeline required)
- Multi-hop queries are native: any depth, any relationship pattern
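To illustrate what retrieval by traversal might look like, here is a small sketch over typed, directed edges. The org-chart triples, the relationship names, and the `follow` helper are all hypothetical; a production system would typically run an equivalent query in a graph database.

```python
from collections import defaultdict

# Hypothetical org-chart triples: (source, relationship, target).
triples = [
    ("Dana", "REPORTS_TO", "Chris"),
    ("Eli", "REPORTS_TO", "Chris"),
    ("Chris", "REPORTS_TO", "Bailey"),
    ("Bailey", "HOLDS_ROLE", "CEO"),
]

# Index edges in both directions so a hop can go with or against edge direction.
outgoing = defaultdict(list)
incoming = defaultdict(list)
for src, rel, dst in triples:
    outgoing[(src, rel)].append(dst)
    incoming[(dst, rel)].append(src)

def follow(nodes, rel, reverse=False):
    """Take one hop along edges of type `rel` from every node in `nodes`."""
    index = incoming if reverse else outgoing
    result = []
    for node in nodes:
        for neighbour in index.get((node, rel), []):
            if neighbour not in result:
                result.append(neighbour)
    return result

# "Who reports to the people who report to the CEO?" takes three chained hops.
ceo = follow(["CEO"], "HOLDS_ROLE", reverse=True)
managers = follow(ceo, "REPORTS_TO", reverse=True)
reports = follow(managers, "REPORTS_TO", reverse=True)
print(ceo, managers, reports)
```

Each hop starts from the previous hop's output, so adding depth means adding another `follow` call rather than issuing a new similarity query.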
Performance Comparison
The performance gap between the two approaches is not marginal on complex tasks:
| Task Type | Vector RAG | Graph RAG |
|---|---|---|
| Simple fact retrieval | Good: fast, sufficient | Overkill: adds latency for no gain |
| Semantic search ("find docs about X") | Excellent | Can hurt precision (adds tangential context) |
| Multi-hop reasoning (2+ hops) | 32% accuracy | 86% accuracy |
| Queries involving 10+ entities | Accuracy near 0% | Above 70% |
| Global synthesis ("main themes across corpus") | Cannot do this | Native (community summaries) |
| Schema-bound aggregation queries | 0% accuracy | 90% accuracy |
Accuracy figures from Microsoft Research GraphRAG benchmarks (2024–2025). Graph RAG wins 70–80% of complex sensemaking tasks in head-to-head evaluations.
Microsoft GraphRAG Architecture
Microsoft Research released GraphRAG in 2024 to address the limitations of naive vector RAG on large corpora. Its key innovation is hierarchical community detection:
- Extract: LLM extracts all entities and relationships from the corpus (expensive, done at indexing time)
- Cluster: Leiden algorithm groups related entities into hierarchical communities
- Summarise: LLM generates a summary for each community at multiple levels of the hierarchy
- Query: Local search finds specific entities; global search traverses community summaries
The community summaries enable questions that no vector index can answer, such as "What are the major recurring themes across all 50,000 support tickets?", because the answer isn't in any single document; it's in the aggregated structure.
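The four stages above can be sketched in miniature. This is not Microsoft's implementation: extraction is assumed to be already done (the triples are hypothetical), clustering uses plain connected components as a stand-in for Leiden, there is a single level instead of a hierarchy, and "summarising" just joins facts into a string where a real pipeline would call an LLM.

```python
from collections import defaultdict

# Stage 1 (Extract) is assumed done: hypothetical triples an LLM might produce.
triples = [
    ("login page", "CAUSES", "timeout error"),
    ("timeout error", "REPORTED_IN", "ticket 101"),
    ("invoice PDF", "HAS_BUG", "wrong currency"),
    ("wrong currency", "REPORTED_IN", "ticket 202"),
]

def cluster(triples):
    """Stage 2 stand-in: connected components (NOT the Leiden algorithm)."""
    neighbours = defaultdict(set)
    for src, _, dst in triples:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    seen, communities = set(), []
    for start in neighbours:
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(neighbours[node] - members)
        seen |= members
        communities.append(members)
    return communities

def summarise(members, triples):
    """Stage 3 stand-in: join the community's facts; a real system calls an LLM."""
    facts = [f"{s} {r} {d}" for s, r, d in triples if s in members]
    return "; ".join(facts)

def global_search(question, summaries):
    """Stage 4 stand-in: assemble every community summary into one prompt."""
    lines = [f"- {summary}" for summary in summaries]
    return f"Question: {question}\nCommunity summaries:\n" + "\n".join(lines)

communities = cluster(triples)
summaries = [summarise(members, triples) for members in communities]
prompt = global_search("What are the recurring themes?", summaries)
print(prompt)
```

The global question is answered from the summaries rather than from any individual document, which is why the expensive work happens at indexing time.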
Cost Tradeoffs
Full GraphRAG indexing cost
The original Microsoft GraphRAG approach generates LLM summaries for every community at indexing time. For a typical enterprise corpus this costs $20–$500 in LLM API calls, and the indexing must be repeated whenever new data is added.
LazyGraphRAG (June 2025)
Microsoft's LazyGraphRAG variant defers community summarisation to query time, reducing indexing cost to under $5. The tradeoff: 2–8 additional seconds per query as summaries are generated on demand.
The Hybrid Pattern
The practical architecture for most production systems is neither pure vector RAG nor pure graph RAG. It's both, used in sequence:
1. Vector search
Find the most relevant starting nodes. Fast semantic similarity narrows a corpus of millions to tens of relevant entities.
2. Graph traversal
From the starting nodes, follow edges to collect related context (dependencies, relationships, co-references) that pure similarity search would miss.
3. LLM generation
Pass the combined context (vector results + graph traversal results) to the LLM. The model gets both semantic similarity and structured relationship context.
LlamaIndex's PropertyGraphIndex is a common implementation: it stores vector embeddings alongside graph triples, enabling this hybrid retrieval pattern from a single index.
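The three steps can be sketched in plain Python. This deliberately does not use the LlamaIndex API; the entity names, two-dimensional vectors, edge types, and helper functions are all hypothetical, and the final step only assembles the prompt that would be sent to an LLM.

```python
import math

# Hypothetical entity embeddings and typed edges for a small service graph.
entity_vectors = {
    "auth-service": [0.9, 0.1],
    "billing-service": [0.1, 0.9],
    "token-library": [0.7, 0.3],
}
edges = [
    ("auth-service", "DEPENDS_ON", "token-library"),
    ("auth-service", "CALLED_BY", "billing-service"),
    ("token-library", "MAINTAINED_BY", "platform-team"),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def vector_search(query_vector, k=1):
    """Step 1: pick the k entities most similar to the query as starting nodes."""
    ranked = sorted(
        entity_vectors,
        key=lambda name: cosine(query_vector, entity_vectors[name]),
        reverse=True,
    )
    return ranked[:k]

def traverse(start_nodes, hops=2):
    """Step 2: breadth-first walk along outgoing edges, collecting facts."""
    frontier, visited, facts = list(start_nodes), set(start_nodes), []
    for _ in range(hops):
        next_frontier = []
        for src, rel, dst in edges:
            if src in frontier:
                facts.append(f"{src} {rel} {dst}")
                if dst not in visited:
                    visited.add(dst)
                    next_frontier.append(dst)
        frontier = next_frontier
    return facts

def build_prompt(question, query_vector):
    """Step 3: combine seeds and graph facts into context for the LLM."""
    seeds = vector_search(query_vector)
    facts = traverse(seeds)
    context = "\n".join(f"- {fact}" for fact in facts)
    return f"Question: {question}\nStarting entities: {', '.join(seeds)}\nFacts:\n{context}"

print(build_prompt("Who maintains what auth depends on?", [1.0, 0.0]))
```

The vector step alone would surface `auth-service` but not `platform-team`; the second hop of traversal is what brings that fact into the context.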
Decision Guide: Which to Use
Use vector RAG when
- Questions are about specific topics or documents ("find me the docs about authentication")
- You need fast, cheap, scalable retrieval
- Your data doesn't have strong relational structure
- You're prototyping or the task is simple enough that vector similarity is sufficient
Use graph RAG when
- Questions require multi-hop reasoning ("what X is connected to Y via Z?")
- You need global synthesis across the entire corpus
- Relationships between entities are part of the answer, not incidental
- Your domain has strong relational structure: healthcare pathways, fraud networks, codebases, org charts
Checklist: Do You Understand This?
- Vector RAG finds similar text; Graph RAG follows typed relationship edges between entities
- Graph RAG: 86% accuracy on multi-hop tasks vs 32% for vector RAG; vector accuracy drops to 0% on 10+ entity queries
- Graph RAG hurts on simple semantic search; vector is better there
- Microsoft GraphRAG: Leiden clustering + LLM community summaries = global synthesis capability
- LazyGraphRAG (June 2025) reduces indexing cost from $20–$500 to under $5 by deferring summaries to query time
- The hybrid pattern (vector search → graph traversal → LLM generation) is best for most production systems