AI Incident Lifecycle
An AI incident is any event in which an AI system causes, contributes to, or is implicated in unexpected harm, failure, or near-miss — to users, third parties, the organisation, or society. AI incidents differ from conventional software incidents in important ways: causes may be statistical rather than deterministic, failures may be gradual rather than sudden, and the harm may not be immediately visible in system logs. A structured incident lifecycle ensures that these differences are handled appropriately.
What Counts as an AI Incident
Clear incidents
- Model produces output that directly causes physical, financial, or psychological harm to a specific individual
- AI system makes a high-consequence decision affecting a group based on a protected characteristic
- Security breach via AI attack vector (prompt injection, model extraction)
- Regulatory violation triggered by AI output (privacy breach, prohibited use)
Near-misses and borderline cases
- Model drift detected before it affects decisions
- Biased output pattern discovered in review that had not yet been reported by users
- User complaint about AI behaviour that, if widespread, would constitute harm
- AI system used for out-of-scope purpose without causing immediate harm
Near-misses should be recorded in the incident register and investigated — they are early warning signals for actual incidents.
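A minimal sketch of what a shared register entry might look like, treating near-misses and actual incidents as the same record shape so both feed the same early-warning analysis. All field and function names here are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class RegisterEntry:
    summary: str
    severity: str          # e.g. "P3"
    is_near_miss: bool     # near-misses go in the same register as incidents
    detected_by: str       # e.g. "monitoring", "user report", "internal audit"
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

def near_miss_rate(register: list[RegisterEntry]) -> float:
    """Share of register entries that are near-misses — an early-warning signal."""
    return sum(e.is_near_miss for e in register) / len(register)
```

A rising near-miss rate in such a register is exactly the early warning the text describes: trouble detected before it becomes a reportable incident.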
Severity Classification
| Severity | Definition | Response time | Escalation |
|---|---|---|---|
| P1 — Critical | Active harm to individuals; regulatory breach in progress; legal exposure; reputational crisis | Immediate — within 1 hour | CISO, Legal, Executive team, Board if material |
| P2 — High | Potential harm to individuals; significant bias discovered; regulatory reporting threshold may be triggered | 4 hours | AI Risk Owner, Legal, affected product leads |
| P3 — Medium | Degraded model performance; user complaints about AI outputs; near-miss with potential for escalation | 24 hours | Model owner, product team |
| P4 — Low | Minor quality issue; isolated incorrect output; low-harm near-miss | 5 business days | Model owner |
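The severity table above can be encoded directly, which makes response deadlines computable rather than tribal knowledge. This is a sketch under assumptions: the structure names are invented, and P4's "5 business days" is approximated as five calendar days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class SeverityLevel:
    name: str
    response_time: timedelta
    escalation: tuple[str, ...]

# Values transcribed from the severity table; P4 approximates
# "5 business days" as 5 calendar days for simplicity.
SEVERITY = {
    "P1": SeverityLevel("Critical", timedelta(hours=1),
                        ("CISO", "Legal", "Executive team", "Board if material")),
    "P2": SeverityLevel("High", timedelta(hours=4),
                        ("AI Risk Owner", "Legal", "Affected product leads")),
    "P3": SeverityLevel("Medium", timedelta(hours=24),
                        ("Model owner", "Product team")),
    "P4": SeverityLevel("Low", timedelta(days=5),
                        ("Model owner",)),
}

def response_deadline(severity: str, reported_at: datetime) -> datetime:
    """Latest acceptable first-response time for an incident of this severity."""
    return reported_at + SEVERITY[severity].response_time
```

Encoding the table this way lets monitoring and ticketing tooling enforce the deadlines automatically, e.g. by paging the escalation list when `response_deadline` passes without acknowledgement.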
The AI Incident Lifecycle
1. Detection — the incident is identified by monitoring alert, user report, internal audit, or regulatory notification. An initial severity assessment is made within the first hour; an incident ticket is opened and responders are assigned.
2. Containment — immediate action to stop or limit ongoing harm. This may involve routing traffic away from the affected model, disabling a specific feature, reverting to a previous model version, or adding a guardrail to block the problematic output pattern. Containment is fast and may be imperfect: it is about limiting harm, not fixing the root cause.
3. Investigation — root cause analysis of what happened and why. AI-specific steps: examine model inputs and outputs around the incident; check for data distribution shifts; review recent model changes; assess whether the failure is a one-off or systematic. See the root cause categories below.
4. Remediation — fix the root cause. Options include model rollback, retraining with corrected data, updated guardrails, additional monitoring, policy changes, and user communication. Remediation must be tested before re-deployment, not pushed straight to production.
5. Closure — the incident is formally closed when the fix is deployed and verified, regulatory obligations are met, the post-incident review is complete, and the incident register is updated. Closure requires sign-off from the AI risk owner; it is not automatic.
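The five phases above (detection, containment, investigation, remediation, closure) can be sketched as an ordered state machine in which closure is gated rather than automatic. Class, method, and flag names are illustrative assumptions; the closure checklist mirrors the text.

```python
PHASES = ["detection", "containment", "investigation", "remediation", "closure"]

class AIIncident:
    def __init__(self, incident_id: str, severity: str):
        self.incident_id = incident_id
        self.severity = severity
        self.phase = "detection"
        # Closure conditions from the text — all must hold before closing.
        self.closure_checks = {
            "fix_deployed_and_verified": False,
            "regulatory_obligations_met": False,
            "post_incident_review_complete": False,
            "incident_register_updated": False,
        }
        self.risk_owner_signoff = False

    def advance(self) -> str:
        """Move to the next phase; entering closure is gated, not automatic."""
        nxt = PHASES[PHASES.index(self.phase) + 1]
        if nxt == "closure":
            if not (all(self.closure_checks.values()) and self.risk_owner_signoff):
                raise RuntimeError(
                    "closure requires all checks plus AI risk owner sign-off")
        self.phase = nxt
        return self.phase
```

Modelling sign-off as a hard gate in code makes the "not automatic" rule enforceable: a workflow built on this cannot drift into silently auto-closing incidents.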
Containment Decision: Shut Down vs Mitigate
When to shut down the model
- Active, ongoing harm to identifiable individuals that cannot be stopped with a targeted fix
- Regulatory obligation to suspend — e.g., an EU AI Act suspension order issued by a national competent authority (NCA)
- Security compromise with unknown blast radius (model may have been manipulated)
- No alternative path to contain the harm within the required response time
When to apply targeted mitigation
- Harm is isolated to a specific input pattern that can be blocked by a guardrail
- Rollback to a previous model version is available and restores safe behaviour
- Traffic can be routed away from the affected feature while the rest of the system continues
- Risk of shutdown (loss of critical service, patient harm from absence of AI-assisted triage) exceeds risk of continuing with mitigations
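The two criteria lists above can be restated as an explicit decision rule, which is useful for runbooks and tabletop exercises. This is a hedged sketch: every field name is an illustrative assumption, and the logic simply transcribes the bullets.

```python
from dataclasses import dataclass

@dataclass
class ContainmentContext:
    regulatory_suspension_order: bool = False      # e.g. EU AI Act order
    security_compromise_unknown_scope: bool = False
    targeted_fix_available: bool = False           # guardrail, rollback, re-route
    shutdown_riskier_than_continuing: bool = False # e.g. loss of AI-assisted triage

def containment_action(ctx: ContainmentContext) -> str:
    """Return 'shut_down' or 'mitigate' following the criteria above."""
    # Hard triggers: a regulatory order, or a compromise with unknown blast
    # radius, mandates shutdown regardless of available mitigations.
    if ctx.regulatory_suspension_order or ctx.security_compromise_unknown_scope:
        return "shut_down"
    # Otherwise mitigate if a targeted fix exists, or if shutdown itself
    # carries the greater risk.
    if ctx.targeted_fix_available or ctx.shutdown_riskier_than_continuing:
        return "mitigate"
    # No path to contain harm within the response window: shut down.
    return "shut_down"
```

Treating the regulatory and security cases as non-negotiable triggers, checked before any cost-benefit comparison, reflects the ordering implied by the lists: mitigation is only weighed once the hard triggers are ruled out.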
Root Cause Analysis for AI Incidents
AI incidents require RCA frameworks that account for the statistical and emergent nature of model failures. Common root cause categories:
- Training data failure: Training data did not represent the input distribution encountered in production; mislabelled or poisoned training data; data used outside its collection context
- Distribution shift: Real-world input distribution has drifted from the training distribution; seasonal or temporal shift not accounted for in monitoring
- Adversarial manipulation: Deliberate crafting of inputs to cause model failure — prompt injection, adversarial examples, data poisoning of the retraining pipeline
- Scope violation: Model used for a task or population it was not validated for — either by users or by feature expansion without re-validation
- Monitoring failure: The failure mode existed before detection but monitoring did not surface it — because metrics were not disaggregated, alerts were set too broadly, or the failure was gradual
- Human override failure: Human review step that should have caught the AI error failed — over-reliance on AI, inadequate training, or review step bypassed
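To make one of these categories concrete, here is a minimal sketch of a distribution-shift check: the Population Stability Index (PSI) between a training-time sample and a production sample of a numeric feature. The 0.25 threshold used in the usage note is a common rule of thumb, not a standard, and the histogram binning here is deliberately simplistic.

```python
import math

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between two samples; higher means more drift."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # guard against a zero-range sample

    def fractions(sample: list[float]) -> list[float]:
        counts = [0] * bins
        for x in sample:
            counts[min(int((x - lo) / width), bins - 1)] += 1
        # Small floor avoids log(0) when a bin is empty in one sample.
        return [max(c / len(sample), 1e-4) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A check like this, run per feature and disaggregated per subgroup, also addresses the monitoring-failure category above: gradual drift that aggregate accuracy metrics would hide shows up as a slowly climbing PSI.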
Checklist: Do You Understand This?
- What distinguishes an AI incident from a near-miss, and why should near-misses be recorded?
- At P1 severity, what is the required response time and who must be escalated to?
- What are the five phases of the AI incident lifecycle in order?
- Under what conditions should a model be shut down rather than mitigated?
- Name four root cause categories specific to AI incidents (not generic to software incidents).
- What must happen before an incident can be formally closed?