AI Incident Lifecycle
An AI incident is any event in which an AI system causes, contributes to, or is implicated in unexpected harm, failure, or near-miss — to users, third parties, the organisation, or society. AI incidents differ from conventional software incidents in important ways: causes may be statistical rather than deterministic, failures may be gradual rather than sudden, and the harm may not be immediately visible in system logs. A structured incident lifecycle ensures that these differences are handled appropriately.
What Counts as an AI Incident
Clear incidents
- Model produces output that directly causes physical, financial, or psychological harm to a specific individual
- AI system makes a high-consequence decision affecting a group based on a protected characteristic
- Security breach via AI attack vector (prompt injection, model extraction)
- Regulatory violation triggered by AI output (privacy breach, prohibited use)
Near-misses and borderline cases
- Model drift detected before it affects decisions
- Biased output pattern discovered in review that had not yet been reported by users
- User complaint about AI behaviour that, if widespread, would constitute harm
- AI system used for out-of-scope purpose without causing immediate harm
Near-misses should be recorded in the incident register and investigated — they are early warning signals for actual incidents.
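A minimal sketch of what a shared register entry might look like, treating near-misses and actual incidents as the same record shape so both feed the same early-warning analysis. All field and function names here are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class RegisterEntry:
    summary: str
    severity: str          # e.g. "P3"
    is_near_miss: bool     # near-misses go in the same register as incidents
    detected_by: str       # e.g. "monitoring", "user report", "internal audit"
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

def near_miss_rate(register: list[RegisterEntry]) -> float:
    """Share of register entries that are near-misses — an early-warning signal."""
    return sum(e.is_near_miss for e in register) / len(register)
```

A rising near-miss rate in such a register is exactly the early warning the text describes: trouble detected before it becomes a reportable incident.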
Severity Classification
| Severity | Definition | Response time | Escalation |
|---|---|---|---|
| P1 — Critical | Active harm to individuals; regulatory breach in progress; legal exposure; reputational crisis | Immediate — within 1 hour | CISO, Legal, Executive team, Board if material |
| P2 — High | Potential harm to individuals; significant bias discovered; regulatory reporting threshold may be triggered | 4 hours | AI Risk Owner, Legal, affected product leads |
| P3 — Medium | Degraded model performance; user complaints about AI outputs; near-miss with potential for escalation | 24 hours | Model owner, product team |
| P4 — Low | Minor quality issue; isolated incorrect output; low-harm near-miss | 5 business days | Model owner |
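The severity table above can be encoded directly, which makes response deadlines computable rather than tribal knowledge. This is a sketch under assumptions: the structure names are invented, and P4's "5 business days" is approximated as five calendar days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class SeverityLevel:
    name: str
    response_time: timedelta
    escalation: tuple[str, ...]

# Values transcribed from the severity table; P4 approximates
# "5 business days" as 5 calendar days for simplicity.
SEVERITY = {
    "P1": SeverityLevel("Critical", timedelta(hours=1),
                        ("CISO", "Legal", "Executive team", "Board if material")),
    "P2": SeverityLevel("High", timedelta(hours=4),
                        ("AI Risk Owner", "Legal", "Affected product leads")),
    "P3": SeverityLevel("Medium", timedelta(hours=24),
                        ("Model owner", "Product team")),
    "P4": SeverityLevel("Low", timedelta(days=5),
                        ("Model owner",)),
}

def response_deadline(severity: str, reported_at: datetime) -> datetime:
    """Latest acceptable first-response time for an incident of this severity."""
    return reported_at + SEVERITY[severity].response_time
```

Encoding the table this way lets monitoring and ticketing tooling enforce the deadlines automatically, e.g. by paging the escalation list when `response_deadline` passes without acknowledgement.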
The AI Incident Lifecycle
1. Detection — the incident is identified by monitoring alert, user report, internal audit, or regulatory notification. An initial severity assessment is made within the first hour; an incident ticket is opened and responders are assigned.
2. Containment — immediate action to stop or limit ongoing harm. This may involve routing traffic away from the affected model, disabling a specific feature, reverting to a previous model version, or adding a guardrail to block the problematic output pattern. Containment is fast and may be imperfect: it is about limiting harm, not fixing the root cause.
3. Investigation — root cause analysis of what happened and why. AI-specific steps: examine model inputs and outputs around the incident; check for data distribution shifts; review recent model changes; assess whether the failure is a one-off or systematic. See the root cause categories below.
4. Remediation — fix the root cause. Options include model rollback, retraining with corrected data, updated guardrails, additional monitoring, policy changes, and user communication. Remediation must be tested before re-deployment, not pushed straight to production.
5. Closure — the incident is formally closed when the fix is deployed and verified, regulatory obligations are met, the post-incident review is complete, and the incident register is updated. Closure requires sign-off from the AI risk owner; it is not automatic.
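The five phases above (detection, containment, investigation, remediation, closure) can be sketched as an ordered state machine in which closure is gated rather than automatic. Class, method, and flag names are illustrative assumptions; the closure checklist mirrors the text.

```python
PHASES = ["detection", "containment", "investigation", "remediation", "closure"]

class AIIncident:
    def __init__(self, incident_id: str, severity: str):
        self.incident_id = incident_id
        self.severity = severity
        self.phase = "detection"
        # Closure conditions from the text — all must hold before closing.
        self.closure_checks = {
            "fix_deployed_and_verified": False,
            "regulatory_obligations_met": False,
            "post_incident_review_complete": False,
            "incident_register_updated": False,
        }
        self.risk_owner_signoff = False

    def advance(self) -> str:
        """Move to the next phase; entering closure is gated, not automatic."""
        nxt = PHASES[PHASES.index(self.phase) + 1]
        if nxt == "closure":
            if not (all(self.closure_checks.values()) and self.risk_owner_signoff):
                raise RuntimeError(
                    "closure requires all checks plus AI risk owner sign-off")
        self.phase = nxt
        return self.phase
```

Modelling sign-off as a hard gate in code makes the "not automatic" rule enforceable: a workflow built on this cannot drift into silently auto-closing incidents.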
Containment Decision: Shut Down vs Mitigate
When to shut down the model
- Active, ongoing harm to identifiable individuals that cannot be stopped with a targeted fix
- Regulatory obligation to suspend — e.g., an EU AI Act suspension order issued by a national competent authority (NCA)
- Security compromise with unknown blast radius (model may have been manipulated)
- No alternative path to contain the harm within the required response time
When to apply targeted mitigation
- Harm is isolated to a specific input pattern that can be blocked by a guardrail
- Rollback to a previous model version is available and restores safe behaviour
- Traffic can be routed away from the affected feature while the rest of the system continues
- Risk of shutdown (loss of critical service, patient harm from absence of AI-assisted triage) exceeds risk of continuing with mitigations
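The two criteria lists above can be restated as an explicit decision rule, which is useful for runbooks and tabletop exercises. This is a hedged sketch: every field name is an illustrative assumption, and the logic simply transcribes the bullets.

```python
from dataclasses import dataclass

@dataclass
class ContainmentContext:
    regulatory_suspension_order: bool = False      # e.g. EU AI Act order
    security_compromise_unknown_scope: bool = False
    targeted_fix_available: bool = False           # guardrail, rollback, re-route
    shutdown_riskier_than_continuing: bool = False # e.g. loss of AI-assisted triage

def containment_action(ctx: ContainmentContext) -> str:
    """Return 'shut_down' or 'mitigate' following the criteria above."""
    # Hard triggers: a regulatory order, or a compromise with unknown blast
    # radius, mandates shutdown regardless of available mitigations.
    if ctx.regulatory_suspension_order or ctx.security_compromise_unknown_scope:
        return "shut_down"
    # Otherwise mitigate if a targeted fix exists, or if shutdown itself
    # carries the greater risk.
    if ctx.targeted_fix_available or ctx.shutdown_riskier_than_continuing:
        return "mitigate"
    # No path to contain harm within the response window: shut down.
    return "shut_down"
```

Treating the regulatory and security cases as non-negotiable triggers, checked before any cost-benefit comparison, reflects the ordering implied by the lists: mitigation is only weighed once the hard triggers are ruled out.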
Root Cause Analysis for AI Incidents
AI incidents require RCA frameworks that account for the statistical and emergent nature of model failures. Common root cause categories:
- Training data failure: Training data did not represent the input distribution encountered in production; mislabelled or poisoned training data; data used outside its collection context
- Distribution shift: Real-world input distribution has drifted from the training distribution; seasonal or temporal shift not accounted for in monitoring
- Adversarial manipulation: Deliberate crafting of inputs to cause model failure — prompt injection, adversarial examples, data poisoning of the retraining pipeline
- Scope violation: Model used for a task or population it was not validated for — either by users or by feature expansion without re-validation
- Monitoring failure: The failure mode existed before detection but monitoring did not surface it — because metrics were not disaggregated, alerts were set too broadly, or the failure was gradual
- Human override failure: Human review step that should have caught the AI error failed — over-reliance on AI, inadequate training, or review step bypassed
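To make one of these categories concrete, here is a minimal sketch of a distribution-shift check: the Population Stability Index (PSI) between a training-time sample and a production sample of a numeric feature. The 0.25 threshold used in the usage note is a common rule of thumb, not a standard, and the histogram binning here is deliberately simplistic.

```python
import math

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between two samples; higher means more drift."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # guard against a zero-range sample

    def fractions(sample: list[float]) -> list[float]:
        counts = [0] * bins
        for x in sample:
            counts[min(int((x - lo) / width), bins - 1)] += 1
        # Small floor avoids log(0) when a bin is empty in one sample.
        return [max(c / len(sample), 1e-4) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A check like this, run per feature and disaggregated per subgroup, also addresses the monitoring-failure category above: gradual drift that aggregate accuracy metrics would hide shows up as a slowly climbing PSI.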
Checklist: Do You Understand This?
- What distinguishes an AI incident from a near-miss, and why should near-misses be recorded?
- At P1 severity, what is the required response time and who must be escalated to?
- What are the five phases of the AI incident lifecycle in order?
- Under what conditions should a model be shut down rather than mitigated?
- Name four root cause categories specific to AI incidents (not generic to software incidents).
- What must happen before an incident can be formally closed?