🧠 All Things AI
Beginner

AI vs ML vs Deep Learning vs GenAI

These four terms get thrown around interchangeably, but they are not the same thing. They form a nested hierarchy — each one is a subset of the one above it. Understanding this hierarchy is the single most important mental model for everything else on this site.

The Hierarchy at a Glance

Think of it as concentric circles — each inner ring is a specialised subset of the one containing it:

  • Artificial Intelligence — any system mimicking human intelligence: rules, search, or learning
  • Machine Learning — AI that learns patterns from data instead of following hand-written rules
  • Deep Learning — ML that uses multi-layer neural networks to learn hierarchical representations
  • Generative AI — deep learning that creates new content: text, images, audio, code, video

Every generative AI system is a deep learning system. Every deep learning system is a machine learning system. And every machine learning system is an AI system. But the reverse is not true — there are AI systems, such as rule-based expert systems, that have nothing to do with machine learning.

Artificial Intelligence (AI)

Artificial Intelligence is the broadest term. It refers to any system that performs tasks that would normally require human intelligence — understanding language, recognizing images, making decisions, planning, or reasoning.

AI includes many approaches, spanning seven decades of research:

  • 1950s–70s — Rule-based systems (if-then-else logic, expert systems); search algorithms (chess engines, planning)
  • 1980s–90s — Expert systems (medicine and engineering knowledge bases); early neural nets (perceptrons, backpropagation discovered)
  • 2000s–10s — Classic ML (SVMs, random forests, gradient boosting); deep learning (CNNs, the 2012 ImageNet breakthrough)
  • 2017–now — Transformers ("Attention Is All You Need"); generative AI (LLMs, diffusion models, multimodal systems)

AI approaches through history — each era built on the last

The key insight: AI is a goal (make machines intelligent), not a specific technology. The technology used to achieve that goal has changed dramatically over 70 years.

Machine Learning (ML)

Machine Learning is the subset of AI where systems learn from data rather than following hand-written rules. Instead of a programmer writing "if the email contains X, mark as spam," you give the system thousands of examples of spam and not-spam emails, and it figures out the patterns itself.
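To make "learning from examples" concrete, here is a toy spam scorer in plain Python. The emails, the word counts, and the `is_spam` helper are all invented for illustration; a real filter would train a proper model (such as naive Bayes) on thousands of messages:

```python
# A minimal sketch of "learning from examples" instead of hand-written rules:
# count how often each word appears in spam vs. ham training emails, then
# score new emails by which class their words were seen in more often.
from collections import Counter

spam = ["win free money now", "free prize claim now", "win money fast"]
ham  = ["meeting at noon", "lunch tomorrow", "project status meeting"]

spam_counts = Counter(w for msg in spam for w in msg.split())
ham_counts  = Counter(w for msg in ham for w in msg.split())

def score(msg):
    """Positive score -> looks like spam; negative -> looks like ham."""
    return sum(spam_counts[w] - ham_counts[w] for w in msg.split())

def is_spam(msg):
    return score(msg) > 0

print(is_spam("claim your free money"))    # True: "free" and "money" were seen in spam
print(is_spam("project status meeting"))   # False: these words were seen in ham
```

Nobody wrote a rule mentioning "free" or "money" — the classifier found those signals in the training examples itself, which is the essence of ML.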

The Three Types of ML

  • Supervised — learns from labeled examples (input → correct output). Use cases: email spam, price prediction, diagnosis
  • Unsupervised — no labels; finds hidden structure in data. Use cases: customer segments, anomaly detection
  • Reinforcement — trial and error; rewards and penalties shape behaviour. Use cases: AlphaGo, robotics, RLHF for LLMs

The three ML paradigms — supervised is the most common in production
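For a taste of the unsupervised paradigm, here is a deliberately tiny 1-D k-means sketch in Python. The spend figures, the two-cluster setup, and the `kmeans_1d` helper are made up for illustration (the min/max initialisation only makes sense for k=2):

```python
# Toy unsupervised example: 1-D k-means finds two customer-spend clusters
# without any labels -- the "hidden structure" the table refers to.
def kmeans_1d(points, k=2, iters=20):
    centers = [min(points), max(points)]   # crude initialisation, fine for k=2
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:                   # assign each point to its nearest center
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        centers = [sum(g) / len(g) for g in groups if g]  # recompute centers
    return centers

spend = [10, 12, 11, 95, 102, 99]   # two obvious groups of customers
print(sorted(kmeans_1d(spend)))      # roughly [11.0, 98.7]
```

No one told the algorithm which customers are "big spenders" — it discovered the two segments from the data alone.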

Classic ML Algorithms

Before deep learning dominated, these algorithms powered most ML applications:

  • Linear/logistic regression — Simple, interpretable, still widely used
  • Decision trees & random forests — Great for tabular data
  • Support vector machines (SVMs) — Effective for classification with clear margins
  • k-nearest neighbors — Simple distance-based classification
  • Gradient boosting (XGBoost, LightGBM) — Still the gold standard for structured/tabular data in 2025

Important: for tabular data (spreadsheets, databases, CSV files), classic ML often still outperforms deep learning. Not everything needs a neural network.
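As a flavour of how simple some classic algorithms are, here is k-nearest neighbors from the list above in plain Python. The tabular dataset, the labels, and the `knn_predict` helper are invented for illustration:

```python
# k-nearest neighbors: classify a query point by majority vote among
# the k closest training examples (Euclidean distance).
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """train: list of (features, label) pairs; query: feature tuple."""
    by_distance = sorted(train, key=lambda ex: math.dist(ex[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Tiny tabular dataset: (height_cm, weight_kg) -> class label
train = [((150, 50), "A"), ((155, 55), "A"), ((152, 53), "A"),
         ((180, 90), "B"), ((185, 95), "B"), ((178, 88), "B")]

print(knn_predict(train, (154, 52)))   # "A": nearest neighbours are all class A
print(knn_predict(train, (182, 91)))   # "B": nearest neighbours are all class B
```

No training step at all — the "model" is just the stored examples plus a distance function, which is why kNN is a good first algorithm to implement by hand.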

Deep Learning (DL)

Deep Learning is the subset of ML that uses neural networks with many layers (hence "deep"). These layers learn increasingly abstract representations of the input data — from raw pixels to edges to shapes to objects, or from raw text to syntax to semantics to meaning.

Why "Deep" Works

A single-layer neural network (a perceptron) can only learn simple linear patterns. Stacking many layers lets the network learn complex, hierarchical patterns that would be impossible to hand-code. The "depth" refers to the number of layers — modern models stack dozens to hundreds of them.
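One way to see why depth needs a nonlinearity between layers: stacking purely linear layers collapses into a single linear layer, so stacking alone buys nothing. A sketch in Python (the weights are arbitrary, chosen only for illustration):

```python
# Two stacked *linear* layers are equivalent to one linear layer,
# so depth without nonlinearity adds no expressive power.
def linear(w, b):
    return lambda x: w * x + b

relu = lambda x: max(0.0, x)   # the nonlinearity between layers

f1, f2 = linear(2.0, 1.0), linear(-3.0, 0.5)

# f2(f1(x)) = -3*(2x + 1) + 0.5 = -6x - 2.5: still just one linear map.
stacked   = lambda x: f2(f1(x))
collapsed = linear(2.0 * -3.0, -3.0 * 1.0 + 0.5)
print(stacked(4.0) == collapsed(4.0))   # True, and true for every x

# With ReLU in between, the composite is no longer linear:
deep = lambda x: f2(relu(f1(x)))
print(deep(-1.0), collapsed(-1.0))      # 0.5 vs 3.5: the functions differ
```

This is why every real network interleaves nonlinear activations with its layers — that interleaving is what makes "deep" meaningful.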

What Made Deep Learning Take Off

Deep learning had existed since the 1980s, but it only became practical around 2012 thanks to four converging factors:

  • Data — ImageNet and Common Crawl: massive labeled datasets harvested from the internet
  • Compute — GPUs (built for gaming) turned out to be perfect for neural-network math
  • Algorithms — dropout, batch normalization, and better optimizers made training stable
  • Breakthrough — AlexNet (2012) won ImageNet with a 10.8-percentage-point error gap over the runner-up

All four factors converged around 2012 — removing any one would have delayed the revolution

Key Deep Learning Architectures

  • Vision — CNNs (2012–2020: image classification, object detection); ViT (Vision Transformer, 2020+)
  • Language — RNNs/LSTMs (sequential; struggled with long-range dependencies); transformers (2017: attention processes all tokens at once)
  • Generation — GANs (generator vs discriminator; image synthesis); diffusion models (Stable Diffusion, DALL-E, Midjourney); autoregressive models (GPT-style token-by-token text generation)

Architecture families by domain — transformers now dominate all three
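The autoregressive idea in that last row can be sketched in a few lines: predict the next token from the sequence so far, append it, repeat. In this toy version the "model" is a hard-coded bigram lookup standing in for a transformer, and the vocabulary is invented for illustration:

```python
# Autoregressive generation in miniature: the loop is real, the model is not.
# Real LLMs replace this lookup table with a transformer that scores
# every possible next token given the whole context.
bigram = {"the": "cat", "cat": "sat", "sat": "on", "on": "the"}

def generate(prompt, steps):
    tokens = prompt.split()
    for _ in range(steps):
        nxt = bigram.get(tokens[-1])   # "predict" the next token from the last one
        if nxt is None:                # stop if there is no continuation
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("the", 4))   # "the cat sat on the"
```

Every GPT-style model runs exactly this loop at inference time — one token out, appended to the context, fed back in — just with a vastly better next-token predictor.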

Generative AI (GenAI)

Generative AI is the subset of deep learning focused on creating new content — text, images, audio, video, code, music, 3D models. Unlike earlier AI that classified or predicted, generative AI produces novel outputs that didn't exist before.

Generative AI by Modality

  • Text — LLMs (GPT-4, Claude, Gemini, LLaMA, Mistral); code assistants (Claude Code, Copilot, Cursor)
  • Images — diffusion models (Stable Diffusion, DALL-E 3, Flux); Midjourney (proprietary image generation)
  • Audio — text-to-speech (ElevenLabs, OpenAI TTS, Piper); music (Suno, Udio)
  • Video — Sora (OpenAI, text-to-video); Runway Gen-3 (video generation and editing); Kling (Kuaishou's video generator)

GenAI modalities in 2025 — text is most mature; video is the frontier

The Transformer Revolution

The 2017 paper "Attention Is All You Need" introduced the transformer architecture, which became the foundation of modern generative AI. Key innovations:

  • Self-attention — Each part of the input can attend to every other part, capturing long-range relationships
  • Parallelization — Unlike RNNs, transformers process entire sequences at once, making training much faster on GPUs
  • Scaling — Transformers improve predictably as you add more data, more parameters, and more compute (scaling laws)
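A stripped-down version of self-attention can be written in plain Python. This sketch drops the learned query/key/value projections (it sets Q = K = V = the raw token vectors), so it shows only the attend-to-everything mechanic, not a full transformer layer:

```python
# Bare-bones self-attention, single head, no learned weights:
# each token's output is a softmax-weighted average of ALL token vectors,
# which is how every position can attend to every other position.
import math

def softmax(xs):
    m = max(xs)                            # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(tokens):
    """tokens: list of equal-length vectors; Q = K = V = tokens for simplicity."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [dot(q, k) / math.sqrt(d) for k in tokens]  # scaled dot products
        weights = softmax(scores)                            # attention weights sum to 1
        out.append([sum(w * v[i] for w, v in zip(weights, tokens))
                    for i in range(d)])
    return out

result = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(len(result), len(result[0]))   # 3 2: three tokens in, three mixed vectors out
```

Note that nothing in the loop depends on token order or distance — token 1 weighs token 3 exactly as easily as its neighbour, which is the long-range advantage over RNNs, and each token's output can be computed independently, which is the parallelization advantage.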

This scalability is why the AI industry is investing billions in compute infrastructure — bigger transformers, trained on more data, consistently produce smarter models.

Common Points of Confusion

"Is ChatGPT artificial intelligence?"

Yes — it's all four at once. ChatGPT is an AI system, built with machine learning, specifically deep learning, specifically generative AI (an LLM). It sits in the innermost circle.

"Is a spam filter AI?"

Yes, if it uses ML (most modern ones do). A rule-based spam filter is also AI in the broadest sense, but it's not ML or deep learning.

"Is all AI generative?"

No. Most AI in production today is not generative — fraud detection, recommendation engines, search ranking, self-driving perception. Generative AI is the newest and most visible category, but it's a small slice of the total AI landscape.

"Do I need to understand ML math to use GenAI?"

No. You can be highly effective using LLMs, building RAG systems, and deploying AI products without understanding backpropagation or gradient descent. But knowing the conceptual hierarchy helps you make better decisions about what tools to use and when.

Practical Takeaway

When someone says "AI," ask yourself which layer they mean:

  • If they mean any intelligent automation → they mean AI broadly
  • If they mean learning from data → they mean ML
  • If they mean neural networks → they mean deep learning
  • If they mean creating text, images, or code → they mean generative AI

This distinction matters because it determines what skills you need, what tools to use, and what limitations to expect. A generative AI solution has very different failure modes than a classic ML classification model.

Checklist: Do You Understand This?

  • Can you explain the four terms to someone in 30 seconds?
  • Can you give an example of AI that is not machine learning?
  • Can you name the architecture behind modern LLMs?
  • Can you explain why deep learning took off around 2012?
  • Can you name three types of generative AI besides text?