# AI Centers of Excellence
An AI Center of Excellence (CoE) centralises expertise that would otherwise be rediscovered in parallel across teams — security reviews, model evaluations, reusable components, and training programmes. Done well, a CoE accelerates adoption while enforcing governance. Done poorly, it becomes a bottleneck that slows everything down while providing the illusion of oversight.
## CoE vs Federated vs Hybrid
| Model | How it works | Advantages | Disadvantages |
|---|---|---|---|
| Centralised CoE | All AI work goes through the CoE; other teams are consumers | Maximum consistency; strong governance; expertise concentrated | Bottleneck; slow to respond to team needs; CoE becomes single point of failure |
| Federated | Each team builds independently; no central coordination | Fast; teams have full autonomy; no bottleneck | Fragmentation; duplicated effort; inconsistent security and quality; no shared learning |
| Hybrid (recommended) | CoE provides platform, standards, and fast-path for low-risk; teams build on top with self-service | Speed for teams + consistency + shared expertise; CoE is an enabler not a gatekeeper | Requires investment to build self-service platform; governance boundaries must be clearly defined |
## CoE Core Responsibilities
### Platform and standards
- Approved model register — which models are approved for which use cases
- Shared infrastructure: LLM gateway (LiteLLM), observability (Langfuse), evaluation harnesses
- Reusable component catalog: approved system prompts, RAG configurations, agent templates
- Security baseline: PII handling standards, prompt injection defences, audit logging schema
- Cost management: centralised budget visibility, chargeback model, guardrail defaults
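The approved model register can be as simple as a versioned config consulted at the gateway before a request is routed. A minimal sketch, assuming a hypothetical register structure — the model names, tier labels, and function names here are illustrative, not a prescribed standard:

```python
# Sketch of an approved-model register check. Model ids and tiers
# are illustrative assumptions; a real register would be versioned
# config loaded by the LLM gateway, not a hardcoded dict.

APPROVED_MODELS = {
    # model id -> highest risk tier the model is approved for
    "gpt-4o": "high",
    "claude-sonnet": "high",
    "local-llama": "low",
}

TIER_ORDER = ["low", "medium", "high"]

def is_approved(model: str, use_case_tier: str) -> bool:
    """Return True if `model` is approved for a use case at `use_case_tier`."""
    approved_up_to = APPROVED_MODELS.get(model)
    if approved_up_to is None:
        return False  # unapproved model: route to CoE intake instead
    return TIER_ORDER.index(use_case_tier) <= TIER_ORDER.index(approved_up_to)
```

Keeping the check this mechanical is the point: teams get an instant yes/no, and only a `False` result triggers a human conversation with the CoE.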
### Governance and enablement
- Intake and risk triage for new AI use cases
- Security review for high-risk use cases before production
- Training programme for practitioners (prompt engineering, responsible use)
- Tool evaluation — assess new AI tools and models before teams adopt
- Community of practice: Slack channel, lunch-and-learns, internal case studies
## CoE Team Composition
| Role | Responsibilities | Full-time or embedded |
|---|---|---|
| CoE Lead | Strategy, stakeholder management, governance process ownership, escalation path | Full-time |
| AI/ML Engineer | Platform engineering, evaluation harnesses, component catalog, model selection and evaluation | Full-time (1-2 engineers minimum) |
| Security representative | AI security standards, threat model reviews, vendor security assessment, DPA review | Embedded from security team; 20-30% allocation |
| Legal/compliance representative | Regulatory requirements (EU AI Act, GDPR), data processing agreements, approved use definitions | Embedded from legal; 10-20% allocation |
| Product representative | Connects CoE standards to product team needs; prevents CoE from building an ivory tower | Rotating; 1 product manager embedded per quarter |
## Avoiding the CoE Bottleneck
**A CoE that reviews everything will eventually approve nothing quickly.**
The most common CoE failure mode is becoming a required approval gate for every AI task, regardless of risk. This creates a queue, review fatigue sets in, and approvals become rubber stamps — the worst of both worlds (slow AND no real oversight). The solution is risk-tiered self-service: low-risk use cases (internal productivity tools, pre-approved patterns) have a self-service fast path. The CoE only reviews medium-risk and high-risk use cases. Define the fast-path criteria clearly and publish them.
- Fast path criteria (no CoE review needed): internal tool only, uses approved model tier, no PII, not customer-facing, uses a catalog component without modification
- CoE review required: customer-facing, processes PII, financial or medical decision support, agentic with irreversible actions, new model or vendor not yet approved
- Target review SLA: 3 business days for standard reviews; 1 day for fast path registration
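The tiering rules above lend themselves to automation in the intake form itself. A minimal sketch of the triage logic, assuming hypothetical field names for the intake record — the criteria mirror the bullets above, but the names and structure are illustrative:

```python
# Sketch of risk-tiered intake triage. Field names are hypothetical;
# the point is that fast-path routing is a deterministic rule check,
# not a judgment call in a queue.
from dataclasses import dataclass

@dataclass
class UseCase:
    internal_only: bool
    approved_model_tier: bool
    processes_pii: bool
    customer_facing: bool
    catalog_component_unmodified: bool
    decision_support: bool        # financial or medical decision support
    irreversible_actions: bool    # agentic, actions cannot be undone
    new_vendor_or_model: bool     # model or vendor not yet approved

def triage(uc: UseCase) -> str:
    """Route an intake record to 'fast_path' or 'coe_review'."""
    needs_review = (
        uc.customer_facing
        or uc.processes_pii
        or uc.decision_support
        or uc.irreversible_actions
        or uc.new_vendor_or_model
    )
    fast_path_ok = (
        uc.internal_only
        and uc.approved_model_tier
        and uc.catalog_component_unmodified
    )
    if not needs_review and fast_path_ok:
        return "fast_path"   # self-service: register and go, 1-day SLA
    return "coe_review"      # standard review, 3-business-day SLA target
```

Because the rules are published and executable, teams can self-assess before they ever file an intake — which is what keeps the queue short.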
## Maturity Stages
### Stage 1 — Ad hoc
Teams adopt AI independently; no shared standards; duplicated effort; security gaps unknown
### Stage 2 — CoE formed
CoE team established; approved model register created; intake process defined; first security standards published
### Stage 3 — Platform built
Shared LLM gateway; observability; component catalog; training programme running; teams onboarded
### Stage 4 — Federated with guardrails
Teams build independently on the CoE platform; self-service for low-risk; CoE focuses on standards and high-risk reviews; fast path reduces bottleneck
### Stage 5 — Optimising
CoE drives cost optimisation, cross-team learning, and advanced capability development; governance is embedded in tooling, not process
## Checklist: Do You Understand This?
- What is the primary failure mode of a centralised CoE — and what does it look like in practice?
- What is the hybrid model, and how does it balance governance with speed?
- Name five core responsibilities of an AI CoE.
- What criteria define a "fast path" use case that bypasses full CoE review?
- Why is a rotating product representative on the CoE important — what failure does it prevent?
- Describe the maturity progression from ad hoc AI adoption to a fully federated model with guardrails.