Observability · Analytics
Operational Intelligence
Trends, AI accuracy and reliability KPIs across services and environments.
Filters
Time range
Environment
Severity
Category
5 incidents
Total incidents
5
-12% vs previous period
Critical
1
+1 active P1
Avg MTTR
23m
-18% improvement
AI confidence
87%
+3 pts model v4.2
SLA compliance
60%
+2.4% resolved within SLO
Incident volume & MTTR
Correlated incident count and mean time to resolution
Incidents
MTTR (min)
Severity distribution
Breakdown of incidents by severity level
Incidents by category
Top failure domains across selected scope
Environment split
Distribution across deployment tiers
AI accuracy by category
Model precision across failure domains (%)
Top impacted systems
Services most frequently appearing in incident blast radius
payments-api: 1
checkout-web: 1
stripe-webhook: 1
order-events: 1
fulfillment-svc: 1
auth-svc: 1
AI insights
Auto-generated observations on the selected scope
MTTR is down 18% vs previous period, with the fastest improvement in the Database category.
Auth incidents trending up after recent cert-rotation policy change.
Kafka consumer lag is the fastest-growing failure pattern this month.
SLA compliance at 60%, below the 95% target across Prod scope.
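
For reference, here is a minimal sketch of how tiles such as Avg MTTR and SLA compliance could be derived from raw incident records. Everything in it is an assumption for illustration: the `Incident` record shape, the function names, the 20-minute SLO window, and the sample durations are all invented (the durations were picked only so the outputs line up with the tiles above). The dashboard's real schema and formulas are not shown on this page.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Incident:
    # Hypothetical record shape; the dashboard's real schema is not shown.
    opened_at: datetime
    resolved_at: Optional[datetime]  # None while the incident is still open


def mttr_minutes(incidents: List[Incident]) -> Optional[float]:
    """Mean time to resolution in minutes, over resolved incidents only."""
    durations = [
        (i.resolved_at - i.opened_at).total_seconds() / 60
        for i in incidents
        if i.resolved_at is not None
    ]
    return sum(durations) / len(durations) if durations else None


def sla_compliance_pct(incidents: List[Incident], slo: timedelta) -> Optional[float]:
    """Percentage of all incidents that were resolved within the SLO window."""
    if not incidents:
        return None
    within = sum(
        1
        for i in incidents
        if i.resolved_at is not None and i.resolved_at - i.opened_at <= slo
    )
    return 100 * within / len(incidents)


def meets_target(compliance_pct: float, target_pct: float = 95.0) -> bool:
    """True when compliance is at or above the target percentage."""
    return compliance_pct >= target_pct


# Invented sample: five incidents resolved after 10, 15, 20, 30 and 40 minutes.
start = datetime(2024, 1, 1, 9, 0)
sample = [
    Incident(start, start + timedelta(minutes=m)) for m in (10, 15, 20, 30, 40)
]

mttr = mttr_minutes(sample)
compliance = sla_compliance_pct(sample, slo=timedelta(minutes=20))  # assumed SLO
print(mttr, compliance, meets_target(compliance))
```

With this sample, three of five incidents fall inside the assumed 20-minute window, so compliance is 60% and `meets_target` returns `False` against a 95% target. Open incidents are excluded from MTTR but still count against compliance here; a real implementation might treat them differently.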