Intermediate

Graph RAG vs Vector RAG

Vector RAG finds similar text. Graph RAG follows connections. For simple lookups, vector search is faster, cheaper, and sufficient. For questions that require multi-hop reasoning, global synthesis, or relationship-based answers, vector search can't compete, and in some cases it fails entirely. Knowing which to use (and when to combine them) is a core architectural decision.

The RAG Complexity Spectrum

Vector RAG: similarity search. Fast, cheap, good for simple lookups.
Graph RAG: relationship traversal. Powerful for complex reasoning.

Example queries, from the vector end of the spectrum to the graph end:

  • Find docs about topic X
  • Summarise recent updates
  • Who worked on projects using library Y?
  • What are the main themes across 10,000 docs?

How They Differ

Vector RAG

  • Embeds documents as vectors; retrieves by cosine similarity
  • Finds text that looks like the query: lexical and semantic proximity
  • Each chunk is independent, with no explicit relationships between items
  • Simple to set up; fast at query time; low indexing cost
  • Degrades on multi-hop questions: “who is the CEO's manager's direct report?” requires two hops that a single vector query can't traverse
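
To make the retrieval step concrete, here is a minimal sketch of cosine-similarity ranking. The three-dimensional vectors are hand-made toy values, not output from a real embedding model, and the names `cosine`, `top_k`, and `chunks` are illustrative assumptions rather than any library's API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunks, k=2):
    """Rank chunks by similarity to the query; each chunk is scored independently."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Toy "embeddings" (hand-made for illustration, not from a real model).
chunks = [
    ("Alice reports to Bob.", [0.9, 0.1, 0.0]),
    ("Bob reports to Carol, the CEO.", [0.8, 0.3, 0.1]),
    ("The cafeteria opens at 8am.", [0.0, 0.1, 0.9]),
]
query = [1.0, 0.2, 0.0]  # hypothetical embedding of a reporting-line question
results = top_k(query, chunks, k=2)
```

Both reporting-line chunks come back, but nothing in the index records that they chain together. Any multi-hop inference is left to the model reading the retrieved text.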

Graph RAG

  • Stores entities and relationships in a graph; retrieves by traversal
  • Finds answers by following connections: relationship chains and patterns
  • Relationships are explicit: typed edges with direction and properties
  • Higher indexing cost (LLM extraction pipeline required)
  • Multi-hop queries are native: any depth, any relationship pattern
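
A small sketch of what traversal over typed edges can look like, using a plain list of triples instead of a graph database. The entity names, relation names, and the helpers `follow` and `who_worked_with` are made up for illustration.

```python
# Typed, directed edges: (source, relation, target). All names are hypothetical.
EDGES = [
    ("Alice", "REPORTS_TO", "Bob"),
    ("Bob", "REPORTS_TO", "Carol"),
    ("Carol", "HAS_ROLE", "CEO"),
    ("Alice", "WORKED_ON", "ProjectX"),
    ("ProjectX", "USES", "LibraryY"),
    ("Dave", "WORKED_ON", "ProjectX"),
]

def follow(start, relations):
    """Follow a fixed chain of relation types from start; return the set of end nodes."""
    frontier = {start}
    for rel in relations:
        frontier = {dst for src, r, dst in EDGES if r == rel and src in frontier}
    return frontier

def who_worked_with(library):
    """Two hops against edge direction: library <- USES - project <- WORKED_ON - person."""
    projects = {src for src, r, dst in EDGES if r == "USES" and dst == library}
    return sorted(src for src, r, dst in EDGES if r == "WORKED_ON" and dst in projects)

managers_manager = follow("Alice", ["REPORTS_TO", "REPORTS_TO"])
library_people = who_worked_with("LibraryY")
```

Each extra hop is one more pass over the edges, so a question like "Who worked on projects using library Y?" becomes a short chain of lookups rather than a similarity guess.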

Performance Comparison

The performance gap between the two approaches is not marginal on complex tasks:

Task Type | Vector RAG | Graph RAG
Simple fact retrieval | Good: fast, sufficient | Overkill: adds latency for no gain
Semantic search (“find docs about X”) | Excellent | Can hurt precision (adds tangential context)
Multi-hop reasoning (2+ hops) | 32% accuracy | 86% accuracy
Queries involving 10+ entities | Accuracy drops to 0% | Above 70%
Global synthesis (“main themes across corpus”) | Cannot do this | Native (community summaries)
Schema-bound aggregation queries | 0% accuracy | 90% accuracy

Accuracy figures from Microsoft Research GraphRAG benchmarks (2024–2025). Graph RAG wins 70–80% of complex sensemaking tasks in head-to-head evaluations.

Microsoft GraphRAG Architecture

Microsoft Research released GraphRAG in 2024 to address the limitations of naive vector RAG on large corpora. Its key innovation is hierarchical community detection:

  1. Extract: an LLM extracts all entities and relationships from the corpus (expensive, done at indexing time)
  2. Cluster: the Leiden algorithm groups related entities into hierarchical communities
  3. Summarise: an LLM generates a summary for each community at multiple levels of the hierarchy
  4. Query: local search finds specific entities; global search traverses community summaries

The community summaries enable questions that no vector index can answer, such as “What are the major recurring themes across all 50,000 support tickets?” The answer isn't in any single document; it's in the aggregated structure.
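
The four steps can be sketched end to end in a few lines. This is a toy under loud assumptions, not Microsoft's implementation: the extracted triples are hard-coded instead of produced by an LLM, connected components stand in for Leiden (so there is a single flat level rather than a hierarchy), and the "summary" is a joined list of facts rather than LLM-written text.

```python
from collections import defaultdict

# Step 1 (Extract): stand-in for LLM output; these triples are hard-coded and hypothetical.
TRIPLES = [
    ("login-service", "DEPENDS_ON", "auth-db"),
    ("auth-db", "HOSTED_ON", "cluster-a"),
    ("billing-api", "DEPENDS_ON", "invoice-db"),
]

def cluster(triples):
    """Step 2 (Cluster): connected components, a simple stand-in for Leiden."""
    neighbours = defaultdict(set)
    for src, _, dst in triples:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    seen, communities = set(), []
    for node in sorted(neighbours):
        if node in seen:
            continue
        stack, members = [node], set()
        while stack:
            current = stack.pop()
            if current in members:
                continue
            members.add(current)
            stack.extend(neighbours[current] - members)
        seen |= members
        communities.append(sorted(members))
    return communities

def summarise(community, triples):
    """Step 3 (Summarise): a real pipeline would call an LLM; here we just list the facts."""
    return "; ".join(f"{s} {r} {d}" for s, r, d in triples if s in community)

def global_search(summaries):
    """Step 4 (Query, global): answer from community summaries, not from raw chunks."""
    return [f"Community {i}: {text}" for i, text in enumerate(summaries)]

communities = cluster(TRIPLES)
summaries = [summarise(c, TRIPLES) for c in communities]
overview = global_search(summaries)
```

The point of the sketch is the shape of the data flow: the global answer is assembled from per-community summaries, which exist only because the corpus was clustered first.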

Cost Tradeoffs

Full GraphRAG indexing cost

The original Microsoft GraphRAG approach generates LLM summaries for every community at indexing time. For a typical enterprise corpus this costs $20–$500 in LLM API calls, and the indexing must be repeated whenever new data is added.

LazyGraphRAG (June 2025)

Microsoft's LazyGraphRAG variant defers community summarisation to query time, reducing indexing cost to under $5. The tradeoff: 2–8 additional seconds per query as summaries are generated on demand.
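
A rough back-of-the-envelope comparison using the figures above. The re-index count and query volume are assumptions chosen for illustration, the lazy variant is assumed to be re-indexed on the same schedule, and the query-time LLM spend of the lazy approach is left out because the text gives no figure for it.

```python
def cumulative_index_cost(cost_per_index, reindex_count):
    """Total indexing spend in dollars: the initial build plus each full re-index."""
    return cost_per_index * (1 + reindex_count)

def added_latency_seconds(queries, seconds_per_query):
    """Total extra wait introduced by generating summaries at query time."""
    return queries * seconds_per_query

# Figures from the text: $20-$500 per full index; under $5 and 2-8 s per query for lazy.
REINDEXES = 12   # assumption: one re-index per month for a year
QUERIES = 1_000  # assumption: yearly query volume

full_low = cumulative_index_cost(20, REINDEXES)
full_high = cumulative_index_cost(500, REINDEXES)
lazy_max = cumulative_index_cost(5, REINDEXES)       # upper bound, since lazy is "under $5"
lazy_wait_low = added_latency_seconds(QUERIES, 2)
lazy_wait_high = added_latency_seconds(QUERIES, 8)
```

Under these assumptions the eager approach spends somewhere between $260 and $6,500 a year on indexing, while the lazy approach stays under $65 and pays instead with 2,000 to 8,000 seconds of cumulative query latency.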

The Hybrid Pattern

The practical architecture for most production systems is neither pure vector RAG nor pure graph RAG. It's both, used in sequence:

1. Vector search

Find the most relevant starting nodes. Fast semantic similarity narrows a corpus of millions to tens of relevant entities.

2. Graph traversal

From the starting nodes, follow edges to collect related context (dependencies, relationships, co-references) that pure similarity search would miss.

3. LLM generation

Pass the combined context (vector results + graph traversal results) to the LLM. The model gets both semantic similarity and structured relationship context.

LlamaIndex's PropertyGraphIndex is a common implementation: it stores vector embeddings alongside graph triples, enabling this hybrid retrieval pattern from a single index.
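
The three-step hybrid flow can be sketched without any framework. This is not the LlamaIndex API: the entity vectors, edges, and the helpers `vector_search`, `traverse`, and `build_prompt` are hypothetical, and the final LLM call is left out so the example stops at the assembled prompt.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical entity embeddings and typed edges.
ENTITY_VECTORS = {
    "auth-service": [0.9, 0.1],
    "billing-api": [0.1, 0.9],
    "token-lib": [0.6, 0.4],
}
EDGES = [
    ("auth-service", "DEPENDS_ON", "token-lib"),
    ("token-lib", "MAINTAINED_BY", "Dana"),
    ("billing-api", "DEPENDS_ON", "invoice-db"),
]

def vector_search(query_vec, k=1):
    """Step 1: pick the k entities most similar to the query as starting nodes."""
    ranked = sorted(ENTITY_VECTORS, key=lambda n: cosine(query_vec, ENTITY_VECTORS[n]), reverse=True)
    return ranked[:k]

def traverse(seeds, hops=2):
    """Step 2: follow outgoing edges from the seeds for a fixed number of hops."""
    facts, frontier = [], set(seeds)
    for _ in range(hops):
        next_frontier = set()
        for src, rel, dst in EDGES:
            if src in frontier:
                facts.append(f"{src} {rel} {dst}")
                next_frontier.add(dst)
        frontier = next_frontier
    return facts

def build_prompt(question, seeds, facts):
    """Step 3: combine both kinds of context; the LLM call itself is out of scope."""
    lines = ["Similar entities: " + ", ".join(seeds), "Graph facts:"]
    lines += [f"- {fact}" for fact in facts]
    lines.append("Question: " + question)
    return "\n".join(lines)

seeds = vector_search([1.0, 0.0], k=1)  # hypothetical embedding of an auth question
facts = traverse(seeds, hops=2)
prompt = build_prompt("Who maintains what auth-service depends on?", seeds, facts)
```

The maintainer, Dana, never appears near the query in vector space. It reaches the prompt only because the traversal walked two edges out from the seed node.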

Decision Guide: Which to Use

Use vector RAG when

  • Questions are about specific topics or documents (“find me the docs about authentication”)
  • You need fast, cheap, scalable retrieval
  • Your data doesn't have strong relational structure
  • You're prototyping or the task is simple enough that vector similarity is sufficient

Use graph RAG when

  • Questions require multi-hop reasoning (“what X is connected to Y via Z?”)
  • You need global synthesis across the entire corpus
  • Relationships between entities are part of the answer, not incidental
  • Your domain has strong relational structure: healthcare pathways, fraud networks, codebases, org charts
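
One way to read the two lists above is as a simple rule: any graph signal tips the choice toward graph RAG, otherwise vector RAG is the default. The sketch below encodes that reading. The `Workload` fields and `choose_retriever` are hypothetical names, and a real decision would also weigh cost and latency.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    """Hypothetical description of a retrieval workload, mirroring the bullets above."""
    needs_multi_hop: bool = False
    needs_global_synthesis: bool = False
    relationships_are_the_answer: bool = False
    strongly_relational_domain: bool = False

def choose_retriever(w: Workload) -> str:
    """Return "graph" if any graph signal is present, else the cheaper "vector" default."""
    graph_signals = [
        w.needs_multi_hop,
        w.needs_global_synthesis,
        w.relationships_are_the_answer,
        w.strongly_relational_domain,
    ]
    return "graph" if any(graph_signals) else "vector"

docs_lookup = choose_retriever(Workload())
fraud_network = choose_retriever(Workload(needs_multi_hop=True, strongly_relational_domain=True))
```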

Checklist: Do You Understand This?

  • Vector RAG finds similar text; Graph RAG follows typed relationship edges between entities
  • Graph RAG: 86% accuracy on multi-hop tasks vs 32% for vector RAG; vector accuracy drops to 0% on 10+ entity queries
  • Graph RAG hurts on simple semantic search; vector is better there
  • Microsoft GraphRAG: Leiden clustering + LLM community summaries = global synthesis capability
  • LazyGraphRAG (June 2025) reduces indexing cost from $20–500 to under $5 by deferring summaries to query time
  • The hybrid pattern (vector search → graph traversal → LLM generation) is best for most production systems

Page built: 01 Jun 2026