Scaling

The laws governing how model capability grows with compute, data, and parameters, and how behavior changes at the largest scales.

In This Section

Scaling Laws — Compute, Data, Parameters

Kaplan et al. power laws, FLOP budgeting, and predicting performance before training.

Chinchilla & Optimal Training

Compute-optimal training, the 20 tokens/parameter rule, and deliberate overtraining.

Emergent Abilities & Phase Transitions

Capabilities that appear suddenly at scale and why they are hard to predict.
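
As a quick orientation to the FLOP budgeting and 20 tokens/parameter ideas listed above, here is a minimal sketch of how a compute budget might be split between parameters and training tokens. It assumes the widely used approximation that training cost is about 6 FLOPs per parameter per token (C ≈ 6·N·D), which is not stated on this page, and it treats the 20 tokens/parameter ratio as a rule of thumb rather than an exact law. The function names are hypothetical and chosen only for illustration.

```python
import math

# Rule-of-thumb ratio from the Chinchilla entry above.
TOKENS_PER_PARAM = 20.0
# Assumption (not from this page): training FLOPs C ≈ 6 * N * D.
FLOPS_PER_PARAM_TOKEN = 6.0


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training cost under the assumed C ≈ 6 * N * D rule."""
    return FLOPS_PER_PARAM_TOKEN * n_params * n_tokens


def compute_optimal_split(flop_budget: float,
                          tokens_per_param: float = TOKENS_PER_PARAM):
    """Return (n_params, n_tokens) that spend flop_budget at a fixed ratio.

    With D = r * N and C = 6 * N * D, we get C = 6 * r * N**2,
    so N = sqrt(C / (6 * r)) and D = r * N.
    """
    if flop_budget <= 0 or tokens_per_param <= 0:
        raise ValueError("flop_budget and tokens_per_param must be positive")
    n_params = math.sqrt(flop_budget / (FLOPS_PER_PARAM_TOKEN * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    n, d = compute_optimal_split(1e21)
    print(f"params ~ {n:.3e}, tokens ~ {d:.3e}")
```

Under these assumptions, a budget of 1e21 FLOPs works out to roughly 2.9 billion parameters and 58 billion tokens. Passing a larger tokens_per_param value models deliberate overtraining: the same budget then buys a smaller model trained on more tokens.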