🧠 All Things AI
Advanced

AI Incident Detection & Reporting

The earlier an AI incident is detected, the less harm it causes. Most AI incidents are not sudden failures — they are gradual drift, slow bias amplification, or persistent quality degradation that only becomes visible once harm has accumulated. Effective detection means monitoring the right signals before users report problems. Effective reporting means knowing which incidents require regulatory disclosure and executing on those obligations within required timelines.

Monitoring Signals for AI Incident Detection

AI monitoring should track signals across three layers: technical model performance, user behaviour, and outcome quality.

Technical signals

  • Model output distribution shift — distribution of predicted classes or scores has changed
  • Input feature distribution shift — incoming data no longer matches training distribution (covariate shift)
  • Confidence calibration degradation — model scores no longer represent true probabilities
  • Latency and error rate spikes — model or serving infrastructure degraded
  • Guardrail trigger rate change — sudden increase in content filter activations may indicate adversarial attack or deployment context change
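One common way to quantify the first signal, output distribution shift, is the Population Stability Index (PSI). A minimal sketch (NumPy assumed; the 0.1 / 0.25 thresholds are conventional rules of thumb, not prescribed by this document, and should be tuned per model):

```python
import numpy as np

def psi(baseline, current, bins=10):
    """Population Stability Index between baseline and current samples of
    model output scores. Common rule of thumb (an assumption, tune per
    model): < 0.1 stable, 0.1-0.25 drifting, > 0.25 shifted."""
    edges = np.quantile(baseline, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf        # cover out-of-range scores
    b_frac = np.histogram(baseline, edges)[0] / len(baseline)
    c_frac = np.histogram(current, edges)[0] / len(current)
    b_frac = np.clip(b_frac, 1e-6, None)         # avoid log(0)
    c_frac = np.clip(c_frac, 1e-6, None)
    return float(np.sum((c_frac - b_frac) * np.log(c_frac / b_frac)))

rng = np.random.default_rng(0)
baseline = rng.beta(2, 5, 10_000)   # score distribution recorded at deployment
shifted = rng.beta(3, 4, 10_000)    # score distribution observed this week
print(round(psi(baseline, baseline[:5000]), 4))  # near zero: no drift
print(round(psi(baseline, shifted), 2))          # large: distribution shifted
```

The same index applies to input feature drift (covariate shift): compute it per feature against the feature distributions recorded at deployment.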

User and outcome signals

  • User override rate — the rate at which human reviewers reject or correct AI outputs is increasing
  • User complaint and feedback volume — spike in reports of incorrect, harmful, or inappropriate outputs
  • Downstream outcome metrics — if AI assists with decisions, track whether outcomes are degrading (e.g., increased loan default rates if a credit model is drifting)
  • Funnel drop — unusual drop-off at AI-assisted steps in a user journey
  • Fairness metric drift — per-subgroup performance diverging from the baseline established at deployment
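For the last signal, a small sketch of a per-subgroup drift check against the deployment baseline (the group names and the 0.05 tolerance are illustrative assumptions):

```python
# Per-subgroup accuracy recorded when the model went to production.
baseline_acc = {"group_a": 0.91, "group_b": 0.89, "group_c": 0.90}

def fairness_drift(current_acc, baseline=baseline_acc, tol=0.05):
    """Return the subgroups whose current accuracy has dropped more than
    `tol` below the per-subgroup baseline established at deployment."""
    return sorted(
        g for g, acc in current_acc.items()
        if baseline.get(g, acc) - acc > tol
    )

# group_b has fallen 7 points below its baseline; the others are within tolerance.
print(fairness_drift({"group_a": 0.90, "group_b": 0.82, "group_c": 0.89}))
# → ['group_b']
```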

Alert Thresholds and Baselines

Monitoring without thresholds produces noise; thresholds without baselines produce false positives. Effective alert configuration requires:

  • Establish baselines at deployment: Record the distribution of key metrics (output scores, confidence, per-subgroup accuracy) at the time the model is deployed to production. This is the reference for drift detection.
  • Set statistical thresholds: Alert when a metric has moved by more than N standard deviations from its rolling baseline, or has crossed an absolute threshold defined as acceptable at the risk assessment stage.
  • Avoid alert fatigue: Do not alert on every metric. Prioritise signals that have a high positive predictive value for actual incidents. Review alert hit rates quarterly — if an alert fires constantly without incidents, tune the threshold.
  • Separate warning from incident: Warning = flag for investigation. Incident = confirmed problem requiring response. A warning should trigger investigation, not the full incident lifecycle.
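The rolling-baseline pattern above can be sketched in a few lines. The window size, sigma multiplier, and absolute threshold below are illustrative assumptions, not recommended values:

```python
from collections import deque
import statistics

class MetricAlert:
    """Rolling-baseline alerting: WARNING when the latest value is more than
    `warn_sigma` standard deviations from the rolling mean; INCIDENT when it
    crosses the absolute threshold set at the risk assessment stage."""

    def __init__(self, window=30, warn_sigma=3.0, incident_abs=None):
        self.history = deque(maxlen=window)
        self.warn_sigma = warn_sigma
        self.incident_abs = incident_abs

    def observe(self, value):
        status = "ok"
        if self.incident_abs is not None and value >= self.incident_abs:
            status = "incident"     # confirmed breach -> full incident lifecycle
        elif len(self.history) >= 5:
            mean = statistics.fmean(self.history)
            sd = statistics.pstdev(self.history) or 1e-9
            if abs(value - mean) > self.warn_sigma * sd:
                status = "warning"  # flag for investigation only
        self.history.append(value)
        return status

alert = MetricAlert(warn_sigma=3.0, incident_abs=0.25)
for v in [0.02, 0.021, 0.019, 0.02, 0.022, 0.02]:
    alert.observe(v)             # build the rolling baseline
print(alert.observe(0.08))       # far outside 3 sigma → "warning"
print(alert.observe(0.30))       # breaches absolute threshold → "incident"
```

Note the warning/incident separation from the last bullet: a warning returns without entering the incident path, and only an absolute-threshold breach (a confirmed problem) escalates.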

User Reporting Channels

Users are often the first to observe AI failures that monitoring does not catch — particularly for qualitative harms like biased outputs, inappropriate tone, or factually incorrect claims in low-frequency contexts.

Effective user reporting mechanisms

  • In-product thumbs down / flag output button directly on AI-generated content
  • Categorised feedback: inaccurate / harmful / biased / other — not just free text
  • Context preservation: capture the exact input and output with the report, not just the user's description
  • Acknowledgement: tell users their report was received and that it will be reviewed
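A minimal sketch of what a structured report record implementing these points might look like; the field names and category set are illustrative assumptions:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Categorised feedback, not just free text (hypothetical category set).
CATEGORIES = {"inaccurate", "harmful", "biased", "other"}

@dataclass
class FeedbackReport:
    category: str              # one of CATEGORIES
    model_input: str           # exact input, not the user's description of it
    model_output: str          # exact AI-generated content being flagged
    comment: str = ""          # optional free text on top of the category
    acknowledged: bool = False # set True once receipt is confirmed to the user
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")

report = FeedbackReport(
    "biased",
    model_input="Rank these job applicants.",
    model_output="(flagged AI output captured verbatim)")
report.acknowledged = True   # user told their report was received
print(report.category)       # → biased
```

Capturing the exact input/output pair with the report is what makes the incident reproducible later; a user's paraphrase rarely is.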

What user reporting alone cannot cover

  • Harms to individuals who are not users (third parties affected by AI decisions)
  • Gradual drift — no single output looks wrong, but the cumulative effect is harmful
  • Harms that users are not aware of (e.g., credit denial due to biased model — user may not know AI was involved)
  • High-frequency low-harm outputs that no individual user finds worth reporting

Escalation Paths

| Severity | Who is notified | Timeline |
| --- | --- | --- |
| P1 Critical | AI Risk Owner + CISO + General Counsel + CEO/CPO + Board (if material) | Within 1 hour; regulatory notification may be required within 24–72 hours |
| P2 High | AI Risk Owner + Legal + Product VP + affected business unit head | Within 4 hours; 24-hour update to stakeholders required |
| P3 Medium | Model owner + product team + AI governance function | Same business day; weekly status until resolved |
| P4 Low | Model owner | Logged and triaged within 5 business days |
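The matrix above can be encoded as data so notification tooling can route automatically. The role lists mirror the table; the hour values for P3 and P4 are rough numeric stand-ins for "same business day" and "5 business days":

```python
# Escalation matrix as a lookup table (roles from the matrix above;
# deadline hours for P3/P4 are approximations of the business-day wording).
ESCALATION = {
    "P1": {"notify": ["AI Risk Owner", "CISO", "General Counsel", "CEO/CPO"],
           "deadline_hours": 1},
    "P2": {"notify": ["AI Risk Owner", "Legal", "Product VP",
                      "Affected BU head"],
           "deadline_hours": 4},
    "P3": {"notify": ["Model owner", "Product team", "AI governance function"],
           "deadline_hours": 8},
    "P4": {"notify": ["Model owner"],
           "deadline_hours": 5 * 24},
}

def route(severity):
    """Return (roles to notify, notification deadline in hours)."""
    entry = ESCALATION[severity]
    return entry["notify"], entry["deadline_hours"]

who, hours = route("P2")
print(who, hours)   # P2: four roles, 4-hour notification window
```

Keeping the matrix in data rather than in people's heads also makes the escalation policy itself reviewable and testable.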

EU AI Act Serious Incident Reporting

The EU AI Act introduces mandatory reporting obligations for "serious incidents" involving high-risk AI systems. These are not optional — non-compliance is subject to fines.

What triggers the obligation

  • A "serious incident" as defined by the Act: the death of a person or serious harm to a person's health; a serious and irreversible disruption of the management or operation of critical infrastructure; infringement of obligations under Union law intended to protect fundamental rights; or serious harm to property or the environment
  • The reporting obligation falls on providers of high-risk AI systems (Annex III) placed on the EU market; deployers who identify a serious incident must inform the provider without delay
  • Market surveillance authorities (national competent authorities / NCAs) in each EU member state are the reporting recipients

Timeline

  • Initial report: immediately after establishing a causal link between the AI system and the incident (or a reasonable likelihood of one), and no later than 15 days after becoming aware of the serious incident
  • Shortened deadlines: no later than 10 days where the incident involves a person's death, and no later than 2 days in the event of a widespread infringement or a serious irreversible disruption of critical infrastructure
  • Final report including root cause and corrective actions once the investigation is complete
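Deadline tracking for these reports can be automated. A minimal sketch with the deadline length as a parameter, since the applicable period depends on the incident type; confirm the correct period for each case with counsel:

```python
from datetime import datetime, timedelta, timezone

def report_due(aware_at: datetime, days: int) -> datetime:
    """Deadline for a report that must be filed within `days` calendar
    days of becoming aware of the incident."""
    return aware_at + timedelta(days=days)

# Hypothetical awareness timestamp; 15 is one of the periods discussed above.
aware = datetime(2025, 3, 3, 9, 0, tzinfo=timezone.utc)
print(report_due(aware, 15).date())   # → 2025-03-18
```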

Internal vs External Disclosure Decisions

| Disclosure type | When required | Decision owner |
| --- | --- | --- |
| Regulator notification | EU AI Act serious incident; GDPR data breach involving AI; sector-specific (FDA, FCA, etc.) | Legal/Compliance — not discretionary |
| Affected individual notification | Where individual rights have been materially affected by an AI decision; GDPR subject rights triggered | Legal + AI Risk Owner |
| Public disclosure | Where concealment would worsen harm; where an affected party intends to disclose independently; where transparency is required by policy or regulation | CEO + Legal — discretionary but default toward disclosure |
| Customer/partner notification | Where a customer or partner was affected or may be implicated; where contractual obligations require disclosure | Legal + Account Management |

Checklist: Do You Understand This?

  • Name three technical monitoring signals and three user/outcome signals that can indicate an AI incident.
  • What is the difference between a monitoring warning and a confirmed incident?
  • For a P2 incident, who must be notified and within what timeframe?
  • Under the EU AI Act, what constitutes a "serious incident" and what is the reporting timeline?
  • Why is user reporting alone insufficient for comprehensive AI incident detection?
  • Who owns the decision to make a public disclosure about an AI incident, and what is the default position?