🧠 All Things AI
Advanced

AI Incident Detection & Reporting

The earlier an AI incident is detected, the less harm it causes. Most AI incidents are not sudden failures — they are gradual drift, slow bias amplification, or persistent quality degradation that only becomes visible once harm has accumulated. Effective detection means monitoring the right signals before users report problems. Effective reporting means knowing which incidents require regulatory disclosure and executing on those obligations within required timelines.

Monitoring Signals for AI Incident Detection

AI monitoring should track signals across three layers: technical model performance, user behaviour, and outcome quality.

Technical signals

  • Model output distribution shift — distribution of predicted classes or scores has changed
  • Input feature distribution shift — incoming data no longer matches training distribution (covariate shift)
  • Confidence calibration degradation — model scores no longer represent true probabilities
  • Latency and error rate spikes — model or serving infrastructure degraded
  • Guardrail trigger rate change — sudden increase in content filter activations may indicate adversarial attack or deployment context change
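One common way to quantify the first signal, output distribution shift, is the Population Stability Index (PSI). A minimal sketch (NumPy assumed; the 0.1 / 0.25 thresholds are conventional rules of thumb, not prescribed by this document, and should be tuned per model):

```python
import numpy as np

def psi(baseline, current, bins=10):
    """Population Stability Index between baseline and current samples of
    model output scores. Common rule of thumb (an assumption, tune per
    model): < 0.1 stable, 0.1-0.25 drifting, > 0.25 shifted."""
    edges = np.quantile(baseline, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf        # cover out-of-range scores
    b_frac = np.histogram(baseline, edges)[0] / len(baseline)
    c_frac = np.histogram(current, edges)[0] / len(current)
    b_frac = np.clip(b_frac, 1e-6, None)         # avoid log(0)
    c_frac = np.clip(c_frac, 1e-6, None)
    return float(np.sum((c_frac - b_frac) * np.log(c_frac / b_frac)))

rng = np.random.default_rng(0)
baseline = rng.beta(2, 5, 10_000)   # score distribution recorded at deployment
shifted = rng.beta(3, 4, 10_000)    # score distribution observed this week
print(round(psi(baseline, baseline[:5000]), 4))  # near zero: no drift
print(round(psi(baseline, shifted), 2))          # large: distribution shifted
```

The same index applies to input feature drift (covariate shift): compute it per feature against the feature distributions recorded at deployment.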

User and outcome signals

  • User override rate — the rate at which human reviewers reject or correct AI outputs is increasing
  • User complaint and feedback volume — spike in reports of incorrect, harmful, or inappropriate outputs
  • Downstream outcome metrics — if AI assists with decisions, track whether outcomes are degrading (e.g., increased loan default rates if a credit model is drifting)
  • Funnel drop — unusual drop-off at AI-assisted steps in a user journey
  • Fairness metric drift — per-subgroup performance diverging from the baseline established at deployment
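For the last signal, a small sketch of a per-subgroup drift check against the deployment baseline (the group names and the 0.05 tolerance are illustrative assumptions):

```python
# Per-subgroup accuracy recorded when the model went to production.
baseline_acc = {"group_a": 0.91, "group_b": 0.89, "group_c": 0.90}

def fairness_drift(current_acc, baseline=baseline_acc, tol=0.05):
    """Return the subgroups whose current accuracy has dropped more than
    `tol` below the per-subgroup baseline established at deployment."""
    return sorted(
        g for g, acc in current_acc.items()
        if baseline.get(g, acc) - acc > tol
    )

# group_b has fallen 7 points below its baseline; the others are within tolerance.
print(fairness_drift({"group_a": 0.90, "group_b": 0.82, "group_c": 0.89}))
# → ['group_b']
```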

Alert Thresholds and Baselines

Monitoring without thresholds produces noise; thresholds without baselines produce false positives. Effective alert configuration requires:

  • Establish baselines at deployment: Record the distribution of key metrics (output scores, confidence, per-subgroup accuracy) at the time the model is deployed to production. This is the reference for drift detection.
  • Set statistical thresholds: Alert when a metric has moved by more than N standard deviations from its rolling baseline, or has crossed an absolute threshold defined as acceptable at the risk assessment stage.
  • Avoid alert fatigue: Do not alert on every metric. Prioritise signals that have a high positive predictive value for actual incidents. Review alert hit rates quarterly — if an alert fires constantly without incidents, tune the threshold.
  • Separate warning from incident: Warning = flag for investigation. Incident = confirmed problem requiring response. A warning should trigger investigation, not the full incident lifecycle.
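The rolling-baseline pattern above can be sketched in a few lines. The window size, sigma multiplier, and absolute threshold below are illustrative assumptions, not recommended values:

```python
from collections import deque
import statistics

class MetricAlert:
    """Rolling-baseline alerting: WARNING when the latest value is more than
    `warn_sigma` standard deviations from the rolling mean; INCIDENT when it
    crosses the absolute threshold set at the risk assessment stage."""

    def __init__(self, window=30, warn_sigma=3.0, incident_abs=None):
        self.history = deque(maxlen=window)
        self.warn_sigma = warn_sigma
        self.incident_abs = incident_abs

    def observe(self, value):
        status = "ok"
        if self.incident_abs is not None and value >= self.incident_abs:
            status = "incident"     # confirmed breach -> full incident lifecycle
        elif len(self.history) >= 5:
            mean = statistics.fmean(self.history)
            sd = statistics.pstdev(self.history) or 1e-9
            if abs(value - mean) > self.warn_sigma * sd:
                status = "warning"  # flag for investigation only
        self.history.append(value)
        return status

alert = MetricAlert(warn_sigma=3.0, incident_abs=0.25)
for v in [0.02, 0.021, 0.019, 0.02, 0.022, 0.02]:
    alert.observe(v)             # build the rolling baseline
print(alert.observe(0.08))       # far outside 3 sigma → "warning"
print(alert.observe(0.30))       # breaches absolute threshold → "incident"
```

Note the warning/incident separation from the last bullet: a warning returns without entering the incident path, and only an absolute-threshold breach (a confirmed problem) escalates.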

User Reporting Channels

Users are often the first to observe AI failures that monitoring does not catch — particularly for qualitative harms like biased outputs, inappropriate tone, or factually incorrect claims in low-frequency contexts.

Effective user reporting mechanisms

  • In-product thumbs down / flag output button directly on AI-generated content
  • Categorised feedback: inaccurate / harmful / biased / other — not just free text
  • Context preservation: capture the exact input and output with the report, not just the user's description
  • Acknowledgement: tell users their report was received and that it will be reviewed
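A minimal sketch of what a structured report record implementing these points might look like; the field names and category set are illustrative assumptions:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Categorised feedback, not just free text (hypothetical category set).
CATEGORIES = {"inaccurate", "harmful", "biased", "other"}

@dataclass
class FeedbackReport:
    category: str              # one of CATEGORIES
    model_input: str           # exact input, not the user's description of it
    model_output: str          # exact AI-generated content being flagged
    comment: str = ""          # optional free text on top of the category
    acknowledged: bool = False # set True once receipt is confirmed to the user
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")

report = FeedbackReport(
    "biased",
    model_input="Rank these job applicants.",
    model_output="(flagged AI output captured verbatim)")
report.acknowledged = True   # user told their report was received
print(report.category)       # → biased
```

Capturing the exact input/output pair with the report is what makes the incident reproducible later; a user's paraphrase rarely is.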

What user reporting alone cannot cover

  • Harms to individuals who are not users (third parties affected by AI decisions)
  • Gradual drift — no single output looks wrong, but the cumulative effect is harmful
  • Harms that users are not aware of (e.g., credit denial due to biased model — user may not know AI was involved)
  • High-frequency low-harm outputs that no individual user finds worth reporting

Escalation Paths

| Severity | Who is notified | Timeline |
| --- | --- | --- |
| P1 Critical | AI Risk Owner + CISO + General Counsel + CEO/CPO + Board (if material) | Within 1 hour; regulatory notification may be required within 24–72 hours |
| P2 High | AI Risk Owner + Legal + Product VP + affected business unit head | Within 4 hours; 24-hour update to stakeholders required |
| P3 Medium | Model owner + product team + AI governance function | Same business day; weekly status until resolved |
| P4 Low | Model owner | Logged and triaged within 5 business days |
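The matrix above can be encoded as data so notification tooling can route automatically. The role lists mirror the table; the hour values for P3 and P4 are rough numeric stand-ins for "same business day" and "5 business days":

```python
# Escalation matrix as a lookup table (roles from the matrix above;
# deadline hours for P3/P4 are approximations of the business-day wording).
ESCALATION = {
    "P1": {"notify": ["AI Risk Owner", "CISO", "General Counsel", "CEO/CPO"],
           "deadline_hours": 1},
    "P2": {"notify": ["AI Risk Owner", "Legal", "Product VP",
                      "Affected BU head"],
           "deadline_hours": 4},
    "P3": {"notify": ["Model owner", "Product team", "AI governance function"],
           "deadline_hours": 8},
    "P4": {"notify": ["Model owner"],
           "deadline_hours": 5 * 24},
}

def route(severity):
    """Return (roles to notify, notification deadline in hours)."""
    entry = ESCALATION[severity]
    return entry["notify"], entry["deadline_hours"]

who, hours = route("P2")
print(who, hours)   # P2: four roles, 4-hour notification window
```

Keeping the matrix in data rather than in people's heads also makes the escalation policy itself reviewable and testable.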

EU AI Act Serious Incident Reporting

The EU AI Act introduces mandatory reporting obligations for "serious incidents" involving high-risk AI systems. These are not optional — non-compliance is subject to fines.

What triggers the obligation

  • A "serious incident" as defined by the Act: the death of a person or serious harm to a person's health; a serious and irreversible disruption of the management or operation of critical infrastructure; infringement of obligations under Union law intended to protect fundamental rights; or serious harm to property or the environment
  • The reporting obligation falls on providers of high-risk AI systems (Annex III) placed on the EU market; deployers who identify a serious incident must inform the provider without delay
  • Market surveillance authorities (national competent authorities / NCAs) in each EU member state are the reporting recipients

Timeline

  • Initial report: immediately after establishing a causal link between the AI system and the incident (or a reasonable likelihood of one), and no later than 15 days after becoming aware of the serious incident
  • Shortened deadlines: no later than 10 days where the incident involves a person's death, and no later than 2 days in the event of a widespread infringement or a serious irreversible disruption of critical infrastructure
  • Final report including root cause and corrective actions once the investigation is complete
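Deadline tracking for these reports can be automated. A minimal sketch with the deadline length as a parameter, since the applicable period depends on the incident type; confirm the correct period for each case with counsel:

```python
from datetime import datetime, timedelta, timezone

def report_due(aware_at: datetime, days: int) -> datetime:
    """Deadline for a report that must be filed within `days` calendar
    days of becoming aware of the incident."""
    return aware_at + timedelta(days=days)

# Hypothetical awareness timestamp; 15 is one of the periods discussed above.
aware = datetime(2025, 3, 3, 9, 0, tzinfo=timezone.utc)
print(report_due(aware, 15).date())   # → 2025-03-18
```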

Internal vs External Disclosure Decisions

| Disclosure type | When required | Decision owner |
| --- | --- | --- |
| Regulator notification | EU AI Act serious incident; GDPR data breach involving AI; sector-specific (FDA, FCA, etc.) | Legal/Compliance — not discretionary |
| Affected individual notification | Where individual rights have been materially affected by an AI decision; GDPR subject rights triggered | Legal + AI Risk Owner |
| Public disclosure | Where concealment would worsen harm; where an affected party intends to disclose independently; where transparency is required by policy or regulation | CEO + Legal — discretionary but default toward disclosure |
| Customer/partner notification | Where a customer or partner was affected or may be implicated; where contractual obligations require disclosure | Legal + Account Management |

Checklist: Do You Understand This?

  • Name three technical monitoring signals and three user/outcome signals that can indicate an AI incident.
  • What is the difference between a monitoring warning and a confirmed incident?
  • For a P2 incident, who must be notified and within what timeframe?
  • Under the EU AI Act, what constitutes a "serious incident" and what is the reporting timeline?
  • Why is user reporting alone insufficient for comprehensive AI incident detection?
  • Who owns the decision to make a public disclosure about an AI incident, and what is the default position?