What Are Knowledge Graphs?

A knowledge graph represents entities (people, concepts, objects, places) as nodes and the relationships between them as edges. Unlike a relational database — where information lives in rigid tables joined by foreign keys — a knowledge graph treats relationships as first-class citizens, making it natural to traverse connections and ask questions that span multiple hops.

The Three Building Blocks

Nodes (Entities)

The things in your graph. A node can represent a person, organisation, product, concept, document, function, or any domain object. Each node has a label (its type) and optional properties.

Person: Alice, Company: Acme Corp, File: auth.ts

Edges (Relationships)

The connections between nodes. Edges are directed and typed, for example WORKS_AT, CALLS, WROTE, MANAGES. The relationship type names what the connection means, which a bare foreign key in SQL does not express on its own.

Alice → WORKS_AT → Acme Corp

Properties

Metadata attached to both nodes and edges. A node might carry a birthdate or confidence score; an edge might carry a start date or a weight. Properties add context without creating new nodes.

since: 2023, confidence: 0.94
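The three building blocks can be sketched in a few lines of Python. This is a minimal illustration, not how any particular graph database stores data; the `Node` and `Edge` classes and the `describe` helper are names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    id: str      # unique key within this toy graph (an assumption of the sketch)
    label: str   # the node's type, e.g. "Person"
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    source: str  # id of the start node
    type: str    # relationship type, e.g. "WORKS_AT"
    target: str  # id of the end node
    properties: dict[str, Any] = field(default_factory=dict)


alice = Node("alice", "Person", {"name": "Alice"})
acme = Node("acme", "Company", {"name": "Acme Corp"})
works_at = Edge("alice", "WORKS_AT", "acme", {"since": 2023, "confidence": 0.94})

nodes = {n.id: n for n in (alice, acme)}
edges = [works_at]


def describe(edge: Edge) -> str:
    """Render an edge using the 'name' property of its two endpoints."""
    src = nodes[edge.source].properties["name"]
    dst = nodes[edge.target].properties["name"]
    return f"{src} -[{edge.type}]-> {dst}"


print(describe(works_at))  # Alice -[WORKS_AT]-> Acme Corp
```

Note that the properties live on the edge itself, so "since 2023" is a fact about the relationship rather than about Alice or Acme Corp.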

Why Not a Relational Database?

Relational databases are excellent for structured, tabular data with predictable query patterns. They are a weaker fit when the questions are mostly about how records connect to each other.

Where SQL struggles

Variable-depth traversals (friends of friends of friends) need recursive CTEs or a chain of self-joins
Evolving schemas: adding a new relationship type usually means a schema migration
Multi-hop queries require one JOIN per hop, and cost tends to grow with traversal depth
Relationship attributes require junction tables that obscure intent
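To make the first point concrete, here is a hedged sketch of a variable-depth traversal in SQL, run through Python's built-in `sqlite3` module. The `friendship` table, its sample rows, and the `reachable_within` helper are all made up for illustration; the point is only how much machinery a "within N hops" question needs in a relational setting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friendship (person TEXT, friend TEXT)")
conn.executemany(
    "INSERT INTO friendship VALUES (?, ?)",
    [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("dave", "erin")],
)


def reachable_within(start: str, max_depth: int) -> list[tuple[str, int]]:
    """People reachable from `start` in at most `max_depth` hops.

    Rows are treated as directed (person -> friend) to keep the sketch short.
    """
    return conn.execute(
        """
        WITH RECURSIVE reach(person, depth) AS (
            SELECT friend, 1 FROM friendship WHERE person = ?
            UNION
            SELECT f.friend, r.depth + 1
            FROM friendship AS f
            JOIN reach AS r ON f.person = r.person
            WHERE r.depth < ?
        )
        SELECT person, MIN(depth) FROM reach
        GROUP BY person
        ORDER BY MIN(depth), person
        """,
        (start, max_depth),
    ).fetchall()


print(reachable_within("alice", 3))  # [('bob', 1), ('carol', 2), ('dave', 3)]
```

The query works, but the traversal logic is spread across a recursive common table expression, a self-join, and a depth guard, none of which says "friends of friends" directly.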

Where graphs excel

Arbitrary-depth traversals are native, with no JOIN explosion
Schema-flexible: new node types and edge types can be added without a migration
Relationships are first-class: typed, queryable, and able to carry weights
Pattern matching across the graph: “find all A→B→C chains where B has property X”
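The two graph-side ideas above, bounded-depth traversal and A→B→C pattern matching, can be sketched in plain Python over a list of triples. This is an in-memory toy, not a graph database engine; the data and the function names `within_hops` and `chains_through` are assumptions made for the example.

```python
from collections import deque

# Toy graph as (source, relationship type, target) triples; all names are made up.
EDGES = [
    ("alice", "MANAGES", "bob"),
    ("bob", "MANAGES", "carol"),
    ("carol", "MANAGES", "dave"),
    ("alice", "WORKS_AT", "acme"),
]
NODE_PROPS = {"bob": {"remote": True}, "carol": {"remote": False}}


def neighbours(node, rel_type):
    """Targets of outgoing edges of the given type."""
    return [t for s, r, t in EDGES if s == node and r == rel_type]


def within_hops(start, rel_type, max_hops):
    """Breadth-first traversal: map each reachable node to its hop distance."""
    seen, found = {start}, {}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours(node, rel_type):
            if nxt not in seen:
                seen.add(nxt)
                found[nxt] = depth + 1
                queue.append((nxt, depth + 1))
    return found


def chains_through(rel_type, prop, value):
    """All A -> B -> C chains over rel_type where B has prop == value."""
    return [
        (a, b, c)
        for a, r, b in EDGES
        if r == rel_type and NODE_PROPS.get(b, {}).get(prop) == value
        for c in neighbours(b, rel_type)
    ]


print(within_hops("alice", "MANAGES", 2))         # {'bob': 1, 'carol': 2}
print(chains_through("MANAGES", "remote", True))  # [('alice', 'bob', 'carol')]
```

Changing the depth is a matter of changing one argument, which is the practical meaning of "arbitrary-depth traversals are native".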

Graph Architecture

Query Layer: Cypher / SPARQL queries; LLM natural language → query
Graph Store: nodes + properties; typed edges + properties; vector index (optional)
Ingestion: entity extraction; relationship extraction; entity resolution
Source Data: documents / PDFs; code / schemas; APIs / databases

Figure: knowledge graph layers. Source data flows up through extraction into a graph store, queried by LLMs or Cypher.
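The flow through these layers can be sketched end to end in a few functions. Everything here is a stand-in: `extract` uses a single regular expression where a real pipeline would use an LLM or an NLP model, `ALIASES` is a hypothetical hand-written alias table standing in for entity resolution, and the "graph store" is just a Python set.

```python
import re

# Hypothetical alias table: maps lowercased surface forms to one canonical name.
ALIASES = {
    "acme": "Acme Corp",
    "acme corp": "Acme Corp",
    "acme corporation": "Acme Corp",
}


def extract(sentence):
    """Stand-in for entity and relationship extraction.

    Recognises only sentences of the form '<X> works at <Y>.'
    """
    m = re.fullmatch(r"(.+?) works at (.+?)\.?", sentence.strip())
    return [(m.group(1), "WORKS_AT", m.group(2))] if m else []


def resolve(name):
    """Entity resolution: collapse known aliases onto a canonical name."""
    return ALIASES.get(name.lower(), name)


def ingest(sentences):
    """Source data -> extraction -> resolution -> graph store (a set of triples)."""
    store = set()
    for sentence in sentences:
        for src, rel, dst in extract(sentence):
            store.add((resolve(src), rel, resolve(dst)))
    return store


def query(store, rel, target):
    """Query layer: which sources point at `target` via `rel`?"""
    return sorted(src for src, r, dst in store if r == rel and dst == target)


graph = ingest([
    "Alice works at Acme.",
    "Bob works at Acme Corporation.",
    "The sky is blue.",
])
print(query(graph, "WORKS_AT", "Acme Corp"))  # ['Alice', 'Bob']
```

Without the resolution step, "Acme" and "Acme Corporation" would become two separate nodes and the query would miss one of the two employees.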

Ontologies and Schemas

A knowledge graph can be schema-free (any node type, any edge type) or ontology-constrained (a formal specification of allowed types and relationships). For AI applications, schema-free construction with LLM extraction is common — you let the model discover the structure. For enterprise or regulated domains, an ontology enforces semantic consistency across the graph.

The practical middle ground: define a light ontology — a small set of node types and relationship types that matter for your use case — and guide LLM extraction toward that schema using few-shot examples or structured output constraints.
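A light ontology can be as small as a dictionary of allowed types plus a check applied to each extracted triple. The sketch below is one possible shape, assuming triples arrive as `((label, name), relationship, (label, name))`; the `ONTOLOGY` contents and the `validate` function are illustrative, not a standard format.

```python
# A deliberately small ontology: allowed node types, and for each relationship
# type the (source type, target type) pair it may connect.
ONTOLOGY = {
    "node_types": {"Person", "Company", "File"},
    "edge_types": {
        "WORKS_AT": ("Person", "Company"),
        "WROTE": ("Person", "File"),
    },
}


def validate(triple):
    """Return a list of problems; an empty list means the triple fits."""
    (src_label, _), rel, (dst_label, _) = triple
    problems = []
    for label in (src_label, dst_label):
        if label not in ONTOLOGY["node_types"]:
            problems.append(f"unknown node type: {label}")
    allowed = ONTOLOGY["edge_types"].get(rel)
    if allowed is None:
        problems.append(f"unknown relationship type: {rel}")
    elif allowed != (src_label, dst_label):
        problems.append(f"{rel} must connect {allowed[0]} -> {allowed[1]}")
    return problems


good = (("Person", "Alice"), "WORKS_AT", ("Company", "Acme Corp"))
reversed_edge = (("Company", "Acme Corp"), "WORKS_AT", ("Person", "Alice"))

print(validate(good))           # []
print(validate(reversed_edge))  # ['WORKS_AT must connect Person -> Company']
```

Rejected triples can be dropped, logged, or sent back to the model with the error message, depending on how strict the graph needs to be.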

Key Graph Databases

Database	Query Language	Strengths	Best For
Neo4j	Cypher	Industry standard, native vector index, LLM Graph Builder tool, free tier (AuraDB)	Most AI/LLM workloads, prototyping, production
FalkorDB	Cypher	Optimised specifically for GraphRAG workloads, low latency	GraphRAG production deployments
NebulaGraph	nGQL	Distributed, high performance, Apache 2.0 open-source	Very large graphs at scale
Memgraph	Cypher	In-memory, streaming graph analytics, compatible with Neo4j drivers via the Bolt protocol	Real-time graphs, low-latency queries
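For the Cypher-based databases in the table, writing an edge usually comes down to a parameterised `MERGE` statement. The sketch below only builds the query text and does not connect to any database. It is written in Neo4j-style Cypher, and other engines may differ in details. The `merge_edge_query` helper is a hypothetical name. Labels and relationship types cannot be passed as query parameters in Cypher, so the helper checks them against a conservative pattern before interpolating them.

```python
import re

# Conservative whitelist for labels and relationship types (an assumption of
# this sketch, stricter than what Cypher itself allows).
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def merge_edge_query(src_label, rel_type, dst_label):
    """Build Cypher that upserts two nodes and one typed edge between them.

    Node names and edge properties stay as parameters ($src, $dst, $props),
    to be supplied separately when the query is run.
    """
    for ident in (src_label, rel_type, dst_label):
        if not _IDENT.fullmatch(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    return (
        f"MERGE (a:{src_label} {{name: $src}}) "
        f"MERGE (b:{dst_label} {{name: $dst}}) "
        f"MERGE (a)-[r:{rel_type}]->(b) "
        "SET r += $props"
    )


cypher = merge_edge_query("Person", "WORKS_AT", "Company")
params = {"src": "Alice", "dst": "Acme Corp", "props": {"since": 2023}}
print(cypher)
```

The resulting string and the `params` dictionary would then be handed to whichever driver the chosen database provides.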

What Problems Do Knowledge Graphs Solve?

Enterprise knowledge bases

Connecting people, documents, projects, and skills across an organisation — enabling search that follows relationships, not just keywords.

Codebase understanding

Mapping function calls, file dependencies, database schemas, and documentation into a queryable graph, so an AI assistant can retrieve only the relevant subgraph instead of whole files, which can substantially reduce the tokens needed per query.
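As a small illustration of the "function calls" part, Python's standard `ast` module can turn source code into CALLS edges. This is a rough sketch for Python source only: it records calls made through plain names, ignores method calls, and attributes calls inside nested functions to the outer function as well. The sample `SOURCE` and the `call_edges` helper are made up for the example.

```python
import ast

SOURCE = '''
def hash_password(pw):
    return pw[::-1]

def login(user, pw):
    return check(user, hash_password(pw))

def check(user, hashed):
    return bool(user and hashed)
'''


def call_edges(source):
    """Return sorted (caller, 'CALLS', callee) triples for plain-name calls."""
    edges = set()
    for func in ast.walk(ast.parse(source)):
        if isinstance(func, ast.FunctionDef):
            for node in ast.walk(func):
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    edges.add((func.name, "CALLS", node.func.id))
    return sorted(edges)


for edge in call_edges(SOURCE):
    print(edge)
# ('check', 'CALLS', 'bool')
# ('login', 'CALLS', 'check')
# ('login', 'CALLS', 'hash_password')
```

With edges like these in a graph, a question such as "what does `login` depend on?" becomes a short traversal rather than a search through every file.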

Fraud detection and compliance

Following transaction chains, relationship networks, and entity linkages that are hard to see in tabular data.

Research and literature graphs

Linking papers, authors, concepts, findings, and citations so an LLM can synthesise across an entire body of work.

Checklist: Do You Understand This?

A knowledge graph has three primitives: nodes (entities), edges (typed relationships), and properties (metadata on both)
Unlike in a relational database, relationships are first-class: directed, typed, and directly traversable without JOINs
Multi-hop queries (“find all X connected to Y via Z”) are where graphs outperform relational databases
Neo4j with Cypher is the industry-standard starting point for AI and LLM graph workloads
A light ontology (defined node/edge types) improves LLM extraction quality and graph consistency