🧠 All Things AI
Intermediate

AI Documentation Standards

A model card documents the model. A datasheet documents the dataset. A system card documents the deployed AI system as a whole — including its context, integrations, and societal implications. Together these form the AI documentation stack that regulators, auditors, procurement teams, and downstream developers need. This page covers the main documentation artefacts, who creates them, and what regulators now require.

Datasheets for Datasets

Datasheets for Datasets (Gebru et al., 2018) applies the concept of hardware datasheets to machine learning datasets. The goal is to provide enough information for a practitioner to decide whether a dataset is appropriate for their use case and to understand the potential risks of using it.

  • Motivation: Why was the dataset created? Who funded or commissioned it? What was the intended purpose?
  • Composition: What instances are in the dataset? How many? What format? Are there labels? Any missing data?
  • Collection process: How was the data collected? Over what time period? From which sources? Who collected it?
  • Preprocessing: Was the data cleaned, tokenised, filtered, or transformed? Is the raw data available?
  • Uses: What tasks is the dataset appropriate for? What uses would be inappropriate?
  • Distribution: Is the dataset publicly available? Under what licence? Any export controls or restrictions?
  • Maintenance: Who maintains it? How will errors be reported and fixed? What is the update cadence?
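The seven datasheet sections above can be captured as a lightweight, machine-readable template so that gaps are visible before a dataset is released. A minimal sketch in Python; the section names follow Gebru et al., but the `Datasheet` class and its field names are illustrative, not a standard schema:

```python
from dataclasses import dataclass, field

# Section names from Datasheets for Datasets (Gebru et al., 2018)
DATASHEET_SECTIONS = (
    "motivation", "composition", "collection_process",
    "preprocessing", "uses", "distribution", "maintenance",
)

@dataclass
class Datasheet:
    """One free-text answer block per datasheet section (illustrative schema)."""
    dataset_name: str
    sections: dict = field(default_factory=dict)

    def missing_sections(self) -> list:
        """Sections not yet answered, so gaps are visible before release."""
        return [s for s in DATASHEET_SECTIONS if not self.sections.get(s)]

sheet = Datasheet("toy-sentiment-v1", sections={
    "motivation": "Created to benchmark sentiment classifiers; internally funded.",
    "composition": "10,000 English product reviews with binary labels; no missing rows.",
})
print(sheet.missing_sections())
# ['collection_process', 'preprocessing', 'uses', 'distribution', 'maintenance']
```

A check like this can run in CI so that a dataset cannot be published with an incomplete datasheet.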

System Cards

A system card documents the deployed AI system as a whole, not just the model. It captures the end-to-end system: which models are used, how they are orchestrated, what safety measures are applied, and what societal implications have been considered. Meta introduced system cards for its AI-driven products (e.g., the Instagram feed ranking system card), and OpenAI has since published system cards for models such as GPT-4.

System card content (beyond model card)

  • System architecture: how models, APIs, databases, and human review steps combine
  • Safety measures: content filters, guardrails, rate limits, abuse monitoring
  • User population and access controls
  • Red team findings and how they were addressed
  • Societal impact assessment: potential for misuse at scale
  • Feedback mechanisms for users to report harms
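The relationship between the two artefacts can be sketched in code: a system card aggregates one or more model cards and adds the system-level content listed above. The class and field names here are illustrative, not a published schema:

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    """Per-model documentation: the building block a system card aggregates."""
    name: str
    intended_use: str

@dataclass
class SystemCard:
    """Documents the deployed system: the models plus everything around them."""
    system_name: str
    model_cards: list = field(default_factory=list)
    safety_measures: list = field(default_factory=list)   # filters, guardrails, rate limits
    red_team_findings: list = field(default_factory=list)
    feedback_channel: str = ""

    def summary(self) -> str:
        return (f"{self.system_name}: {len(self.model_cards)} model(s), "
                f"{len(self.safety_measures)} safety measure(s), "
                f"{len(self.red_team_findings)} red-team finding(s)")

card = SystemCard(
    "support-assistant",
    model_cards=[ModelCard("chat-llm-v2", "customer support drafting")],
    safety_measures=["PII filter", "toxicity guardrail", "rate limiting"],
    red_team_findings=["prompt injection via pasted emails (mitigated)"],
    feedback_channel="in-product harm report form",
)
print(card.summary())  # support-assistant: 1 model(s), 3 safety measure(s), 1 red-team finding(s)
```

The design point is composition: a system card never replaces the model cards, it references them and layers deployment context on top.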

When a system card is needed

  • Publicly deployed AI products or APIs
  • High-risk use cases in regulated sectors (healthcare, finance, HR, law enforcement)
  • Enterprise AI deployments requiring governance documentation
  • Research model releases where downstream misuse is a known risk

EU AI Act Documentation Obligations

The EU AI Act (2024) introduces mandatory documentation requirements for AI systems. The obligation level depends on risk classification:

High-risk AI systems (Annex III) — Technical Documentation (Annex IV)

  • General description of the AI system including its intended purpose
  • Description of system components: algorithms, training methodology, training data characteristics
  • Information about training, validation, and testing datasets — including their provenance and curation methodology
  • Design specifications for data governance — how training data was managed and what quality criteria applied
  • Description of monitoring, functioning, and control measures
  • Post-market monitoring plan and serious incident reporting procedure
  • Detailed description of performance and accuracy metrics, including disaggregated results
  • Known or foreseeable risks and risk mitigation measures taken
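The required contents above lend themselves to an automated completeness gate: release is blocked until every section of the technical documentation is filled in. A hedged sketch; the section keys paraphrase the list above and are not the Act's official wording:

```python
# Paraphrased section labels for high-risk technical documentation;
# the keys are illustrative, not the EU AI Act's official wording.
REQUIRED_SECTIONS = [
    "general_description", "system_components", "dataset_information",
    "data_governance", "monitoring_and_control", "post_market_plan",
    "performance_metrics", "risks_and_mitigations",
]

def documentation_gaps(doc: dict) -> list:
    """Required sections that are missing or empty: a simple pre-release gate."""
    return [s for s in REQUIRED_SECTIONS if not doc.get(s)]

draft = {
    "general_description": "Credit-scoring assistant for loan officers.",
    "performance_metrics": "AUC 0.91 overall; disaggregated results attached.",
}
print(documentation_gaps(draft))
# ['system_components', 'dataset_information', 'data_governance',
#  'monitoring_and_control', 'post_market_plan', 'risks_and_mitigations']
```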

General Purpose AI (GPAI) models — Transparency documentation

  • Technical documentation for the model (capabilities, limitations, training approach)
  • Summary of training data (content types, geographical scope, languages)
  • For systemic risk GPAI (≥10²⁵ FLOPs training compute): adversarial testing results, cybersecurity incident reporting obligations
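The systemic-risk classification reduces to a compute comparison against the 10²⁵ FLOP presumption threshold. A minimal sketch; note that the training-FLOP estimation itself (the common ~6 × parameters × tokens approximation for dense transformers) is a rule of thumb assumed here, not something the Act specifies:

```python
SYSTEMIC_RISK_FLOPS = 1e25  # EU AI Act presumption threshold for systemic-risk GPAI

def estimate_training_flops(n_params: float, n_tokens: float) -> float:
    """Rough rule of thumb for dense transformers: ~6 FLOPs per parameter per token."""
    return 6 * n_params * n_tokens

def is_systemic_risk(training_flops: float) -> bool:
    """Does estimated training compute meet the systemic-risk presumption?"""
    return training_flops >= SYSTEMIC_RISK_FLOPS

# A 70B-parameter model trained on 2T tokens: ~8.4e23 FLOPs
flops = estimate_training_flops(70e9, 2e12)
print(is_systemic_risk(flops))  # False: below the 1e25 presumption
```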

Internal vs External Documentation

  • Internal developers: internal model documentation, training runbook, evaluation report. Covers architecture details, hyperparameters, full evaluation results including failures, and known issues.
  • Risk and compliance: risk assessment, model risk management report. Covers risk identification and treatment, regulatory obligations, and residual risks accepted.
  • Downstream developers: public model card, API documentation. Covers intended use, performance characteristics, known limitations, licence, and usage restrictions.
  • Regulators / auditors: technical documentation (EU AI Act format), audit trail. Covers the full documentation stack, decisions made during development, and evidence of conformity assessment.
  • Affected individuals: plain-language explanation, transparency notice. Covers what decision the AI informed, what factors mattered, and how to seek review or redress.

Living Documentation: Keeping Records Current

AI documentation becomes misleading if it describes a model version that is no longer deployed. Keeping documentation current requires process, not just intent:

  • Documentation triggers: Define which events require a documentation update — model retrain, new evaluation findings, new deployment context, regulatory change, discovered failure mode
  • Model registry integration: Link documentation directly to the model registry entry so that every model version has an associated, versioned documentation artefact
  • Review cadence: Schedule annual documentation reviews even when no model changes occur — the external context (regulation, known harms) evolves independently of the model
  • Ownership: Assign a named owner per system documentation set — typically the model owner or responsible AI lead. Without named ownership, documentation goes stale.
  • Audit trail: Maintain a log of what changed and when — regulators may request evidence that documentation was current at the time of a specific deployment decision
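The trigger list and audit trail above can be wired into tooling so that every documentation update is validated and logged with a timestamp. A minimal sketch; the trigger names mirror the bullets above, and the `DocRecord` class and its methods are illustrative:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Events the text names as documentation triggers (illustrative set)
TRIGGERS = {"retrain", "new_evaluation", "new_deployment_context",
            "regulatory_change", "failure_mode_discovered"}

@dataclass
class DocRecord:
    """One versioned documentation artefact tied to a model version."""
    model_version: str
    owner: str                                 # named owner, so the record cannot go stale silently
    log: list = field(default_factory=list)    # audit trail entries: (timestamp, trigger, note)

    def update(self, trigger: str, note: str) -> None:
        """Record a documentation change; only recognised triggers are accepted."""
        if trigger not in TRIGGERS:
            raise ValueError(f"unknown documentation trigger: {trigger}")
        self.log.append((datetime.now(timezone.utc).isoformat(), trigger, note))

    def current_as_of(self, iso_timestamp: str) -> bool:
        """Crude audit check: was there at least one update at or before this moment?"""
        return any(ts <= iso_timestamp for ts, _, _ in self.log)

rec = DocRecord("fraud-model-v3", owner="responsible-ai-lead")
rec.update("retrain", "Quarterly retrain on Q2 data; refreshed metrics section.")
print(len(rec.log))  # 1
```

The `current_as_of` check is the kind of evidence a regulator might ask for: proof that the documentation had been updated before a given deployment decision.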

Checklist: Do You Understand This?

  • What is the difference between a datasheet for a dataset and a model card?
  • What additional content does a system card include beyond a model card?
  • Under the EU AI Act, what documentation is required for a high-risk AI system vs a GPAI model?
  • Name three different audiences for AI documentation and what each audience primarily needs.
  • What events should trigger a documentation update, and who should own that responsibility?