Operations Dashboard
Live AI-assisted incident overview across all environments.
Open incidents
2
−12% vs last week
High severity
3
2 active in Prod
MTTR reduction
−42%
vs 30-day baseline
AI confidence avg
87%
↑ 3 pts
Resolved (30d)
30
avg 23m saved
Incident volume & MTTR (14d)
Auto-refreshed
AI Operational Insights
- Repeated DB pool exhaustion in payments — recommend permanent index fix.
- Auth incidents trending up post cert-rotation policy change.
- Kafka consumer lag is the fastest-growing category this month.
- Confidence on Network incidents below team average (74%).
Recent incidents
INC-2041
Payments API 5xx spike in eu-west-1
Prod · Database · 3 systems
94%
INC-2039
Kafka consumer lag on order-events
Prod · Messaging · 2 systems
87%
INC-2037
Auth service intermittent 401s after cert rotation
Prod · Auth · 2 systems
91%
INC-2032
Memory leak in recommendation worker
UAT · Application · 1 system
82%
INC-2028
CDN cache poisoning on /assets/*
Prod · Network · 2 systems
79%