AI Documentation Standards
A model card documents the model. A datasheet documents the dataset. A system card documents the deployed AI system as a whole — including its context, integrations, and societal implications. Together these form the AI documentation stack that regulators, auditors, procurement teams, and downstream developers need. This page covers the main documentation artefacts, who creates them, and what regulators now require.
Datasheets for Datasets
Datasheets for Datasets (Gebru et al., 2018) adapts the practice of datasheets for electronic components to machine learning datasets. The goal is to give a practitioner enough information to decide whether a dataset is appropriate for their use case and to understand the risks of using it.
| Section | Key questions answered |
|---|---|
| Motivation | Why was the dataset created? Who funded or commissioned it? What was the intended purpose? |
| Composition | What instances are in the dataset? How many? What format? Are there labels? Any missing data? |
| Collection process | How was data collected? Over what time period? From which sources? Who collected it? |
| Preprocessing | Was the data cleaned, tokenised, filtered, or transformed? Is raw data available? |
| Uses | What tasks is this dataset appropriate for? What uses would be inappropriate? |
| Distribution | Is the dataset publicly available? Under what licence? Any export controls or restrictions? |
| Maintenance | Who maintains it? How will errors be reported and fixed? What is the update cadence? |
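The seven sections above lend themselves to a structured template that can be checked for completeness before a dataset is published. A minimal sketch in Python — the field names mirror the Gebru et al. sections, but the class and the `missing_sections` helper are illustrative, not a standard API:

```python
from dataclasses import dataclass, fields

@dataclass
class Datasheet:
    """Illustrative skeleton mirroring the Gebru et al. sections."""
    motivation: str          # why created, who funded, intended purpose
    composition: str         # instances, counts, format, labels, missing data
    collection_process: str  # sources, time period, who collected it
    preprocessing: str       # cleaning, tokenisation, filtering; raw data availability
    uses: str                # appropriate and inappropriate uses
    distribution: str        # availability, licence, export restrictions
    maintenance: str         # owner, error reporting, update cadence

def missing_sections(sheet: Datasheet) -> list[str]:
    """Return the names of sections left empty — a simple completeness
    gate before a dataset is released or approved for internal reuse."""
    return [f.name for f in fields(sheet) if not getattr(sheet, f.name).strip()]
```

In practice such a gate would sit in the dataset release pipeline, blocking publication until every section has content.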
System Cards
Where a model card documents a single model, a system card captures the end-to-end deployed system: which models are used, how they are orchestrated, what safety measures are applied, and what societal implications have been considered. Meta introduced the format with system cards for its ranking systems (e.g., Instagram feed ranking), and OpenAI adopted it for model releases such as the GPT-4 system card.
System card content (beyond model card)
- System architecture: how models, APIs, databases, and human review steps combine
- Safety measures: content filters, guardrails, rate limits, abuse monitoring
- User population and access controls
- Red team findings and how they were addressed
- Societal impact assessment: potential for misuse at scale
- Feedback mechanisms for users to report harms
When a system card is needed
- Publicly deployed AI products or APIs
- High-risk use cases in regulated sectors (healthcare, finance, HR, law enforcement)
- Enterprise AI deployments requiring governance documentation
- Research model releases where downstream misuse is a known risk
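The four triggers above can be encoded as a simple decision helper — useful as a governance-pipeline check. This is a sketch; the `Deployment` fields and the `REGULATED_SECTORS` list are illustrative assumptions, not a definition from any standard:

```python
from dataclasses import dataclass

# Illustrative list of regulated sectors from the criteria above
REGULATED_SECTORS = {"healthcare", "finance", "hr", "law_enforcement"}

@dataclass
class Deployment:
    publicly_available: bool   # public product or API
    sector: str                # e.g. "healthcare", "retail"
    enterprise_governed: bool  # enterprise deployment needing governance docs
    research_release: bool     # public research model release
    known_misuse_risk: bool    # downstream misuse is a known risk

def needs_system_card(d: Deployment) -> bool:
    """Any one of the four triggers suffices."""
    return (
        d.publicly_available
        or d.sector in REGULATED_SECTORS
        or d.enterprise_governed
        or (d.research_release and d.known_misuse_risk)
    )
```

A check like this belongs in a release-review workflow, where a `True` result routes the deployment to the team that owns system card production.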
EU AI Act Documentation Obligations
The EU AI Act (2024) introduces mandatory documentation requirements for AI systems. The obligation level depends on risk classification:
High-risk AI systems (Annex III use cases) — Technical documentation (Annex IV)
- General description of the AI system including its intended purpose
- Description of system components: algorithms, training methodology, training data characteristics
- Information about training, validation, and testing datasets — including their provenance and curation methodology
- Design specifications for data governance — how training data was managed and what quality criteria applied
- Description of monitoring, functioning, and control measures
- Post-market monitoring plan and serious incident reporting procedure
- Detailed description of performance and accuracy metrics, including disaggregated results
- Known or foreseeable risks and risk mitigation measures taken
General-purpose AI (GPAI) models — Transparency documentation
- Technical documentation for the model (capabilities, limitations, training approach)
- Summary of training data (content types, geographical scope, languages)
- For systemic risk GPAI (≥10²⁵ FLOPs training compute): adversarial testing results, cybersecurity incident reporting obligations
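Whether a model falls under the systemic-risk presumption can be estimated from its parameter and token counts. The sketch below uses the common ~6 FLOPs-per-parameter-per-token rule of thumb for transformer training; that approximation is an assumption of this example, not part of the Act, which refers only to cumulative training compute:

```python
SYSTEMIC_RISK_FLOPS = 1e25  # EU AI Act presumption threshold for systemic-risk GPAI

def estimated_training_flops(params: float, tokens: float) -> float:
    """Rule-of-thumb estimate: ~6 FLOPs per parameter per training token
    (forward + backward pass). An approximation, not the Act's definition."""
    return 6.0 * params * tokens

def presumed_systemic_risk(params: float, tokens: float) -> bool:
    return estimated_training_flops(params, tokens) >= SYSTEMIC_RISK_FLOPS

# e.g. a 70e9-parameter model trained on 2e12 tokens:
# 6 * 70e9 * 2e12 = 8.4e23 FLOPs — below the 1e25 threshold
```

A provider near the threshold would need a more careful accounting (including fine-tuning compute) rather than this back-of-the-envelope check.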
Internal vs External Documentation
| Audience | Document type | Key content |
|---|---|---|
| Internal developers | Internal model documentation, training runbook, evaluation report | Architecture details, hyperparameters, full evaluation results including failures, known issues |
| Risk and compliance | Risk assessment, model risk management report | Risk identification and treatment, regulatory obligations, residual risks accepted |
| Downstream developers | Public model card, API documentation | Intended use, performance characteristics, known limitations, licence, usage restrictions |
| Regulators / auditors | Technical documentation (EU AI Act format), audit trail | Full documentation stack; decisions made during development; evidence of conformity assessment |
| Affected individuals | Plain-language explanation, transparency notice | What decision the AI informed; what factors mattered; how to seek review or redress |
Living Documentation: Keeping Records Current
AI documentation becomes misleading if it describes a model version that is no longer deployed. Keeping documentation current requires process, not just intent:
- Documentation triggers: Define which events require a documentation update — model retrain, new evaluation findings, new deployment context, regulatory change, discovered failure mode
- Model registry integration: Link documentation directly to the model registry entry so that every model version has an associated, versioned documentation artefact
- Review cadence: Schedule annual documentation reviews even when no model changes occur — the external context (regulation, known harms) evolves independently of the model
- Ownership: Assign a named owner per system documentation set — typically the model owner or responsible AI lead. Without named ownership, documentation goes stale.
- Audit trail: Maintain a log of what changed and when — regulators may request evidence that documentation was current at the time of a specific deployment decision
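The staleness checks above can be mechanised against a model registry. A minimal sketch — the record fields, trigger names, and `needs_update` helper are illustrative assumptions, not the API of any particular registry product:

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Illustrative trigger events drawn from the list above
DOC_TRIGGERS = {"retrain", "new_evaluation", "new_deployment_context",
                "regulatory_change", "failure_mode_discovered"}

@dataclass
class DocRecord:
    model_version: str            # model version this documentation describes
    doc_version: str              # version of the documentation set itself
    last_reviewed: date           # last scheduled review date
    owner: str                    # named owner (model owner / responsible AI lead)
    changelog: list = field(default_factory=list)  # audit trail: (date, event) pairs

def needs_update(rec: DocRecord, deployed_model_version: str,
                 pending_events: set[str],
                 review_interval: timedelta = timedelta(days=365)) -> list[str]:
    """Return the reasons a documentation set is stale (empty list = current)."""
    reasons = []
    if rec.model_version != deployed_model_version:
        reasons.append("documents a version that is no longer deployed")
    reasons += sorted(DOC_TRIGGERS & pending_events)   # recognised triggers pending
    if date.today() - rec.last_reviewed > review_interval:
        reasons.append("scheduled review overdue")
    return reasons
```

Run as a periodic job, a non-empty result would open a task for the named owner and append the resolution to the audit trail.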
Checklist: Do You Understand This?
- What is the difference between a datasheet for a dataset and a model card?
- What additional content does a system card include beyond a model card?
- Under the EU AI Act, what documentation is required for a high-risk AI system vs a GPAI model?
- Name three different audiences for AI documentation and what each audience primarily needs.
- What events should trigger a documentation update, and who should own that responsibility?