AIOps and Event Intelligence feature issues
This summary is created by Generative AI and may differ from the actual content.
Overview
On May 12, 2026, between 11:50 UTC and 14:55 UTC, PagerDuty customers in the US service region experienced a localized degradation of the AIOps and Event Intelligence features. Specifically, the Intelligent Change Correlation and Probable Origin insights on the Incident Details page failed to render for fewer than 5% of incidents. Core PagerDuty functionality, including event ingestion, incident processing, and notification delivery, remained operational. The issue stemmed from a data-pipeline refresh task that reported successful completion without persisting updated correlation data, which left the UI without insights to display. Engineering identified the problem, manually restored the data, and re-ran the pipeline, restoring the insights by 16:15 UTC. Follow-up work is underway to improve pipeline failure detection, introduce dedicated operational workflows for data promotion, and add upstream safeguards.
Impact
The degradation was limited to the AIOps and Event Intelligence UI insights (Intelligent Change Correlation and Probable Origin) and affected fewer than 5% of incidents. Core PagerDuty services such as event ingestion, incident processing, and notification delivery were not impacted. For affected incidents, the only customer-visible effect was that these two insights were unavailable during the incident window.
Trigger
A data-pipeline refresh task responsible for updating correlation data reported successful completion even though it had failed to write the updated data to the expected storage locations. Because of this false success report, the AIOps incident correlation capability served stale or missing data, and the UI insights did not render.
Detection
Engineering teams observed the absence of Intelligent Change Correlation and Probable Origin insights on the Incident Details page. Internal monitoring flagged the discrepancy between expected data availability and actual UI rendering, prompting investigation and identification of the faulty pipeline refresh task.
Resolution
The engineering team manually restored the missing correlation data and re‑executed the data‑pipeline refresh process. By 16:15 UTC, the affected insights were restored and the UI functionality returned to normal. Additional corrective actions include improving pipeline failure detection and establishing dedicated operational workflows for data promotion.
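As a rough illustration of what improved pipeline failure detection could look like, the sketch below flags datasets whose last successful update is older than an allowed age, regardless of what the refresh task itself reported. It is not PagerDuty's implementation. The function name `find_stale_datasets`, the dataset names, and the six-hour threshold are all hypothetical and chosen only for the example.

```python
from datetime import datetime, timedelta, timezone


def find_stale_datasets(last_updated, max_age, now=None):
    """Return the names of datasets that are missing or older than max_age.

    last_updated maps a dataset name to the timezone-aware datetime of its
    last persisted update, or None if no update has ever been stored.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    stale = []
    for name, updated_at in last_updated.items():
        # A dataset with no stored timestamp counts as stale, not as healthy.
        if updated_at is None or now - updated_at > max_age:
            stale.append(name)
    return sorted(stale)


# Hypothetical dataset names and timestamps, for illustration only.
check_time = datetime(2026, 5, 12, 12, 0, tzinfo=timezone.utc)
observed = {
    "change_correlation": check_time - timedelta(hours=30),
    "probable_origin": check_time - timedelta(hours=1),
    "related_incidents": None,
}
stale_names = find_stale_datasets(observed, timedelta(hours=6), now=check_time)
print(stale_names)
```

A check of this kind looks at the stored data directly, so it can still raise an alert when a task reports success but nothing new was written.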
Root Cause
The data-pipeline refresh process allowed a task to report successful completion without persisting updated data, and the existing failure detection did not catch the unsuccessful write. Together, these two gaps led to the temporary loss of AIOps and Event Intelligence insights.
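One general way to guard against this failure mode is to have a task report success only after it has read back what it wrote and confirmed that it matches. The sketch below shows the idea using a local JSON file as a stand-in for the real storage location. It is an assumption-laden illustration, not the actual pipeline: `refresh_with_verification`, `PersistenceError`, the file layout, and the rule that an empty refresh counts as a failure are all hypothetical.

```python
import hashlib
import json
import os
import tempfile


class PersistenceError(RuntimeError):
    """Raised when a refresh has nothing to store or the stored data does not match."""


def refresh_with_verification(records, dest_path):
    """Write records to dest_path and return a receipt only after read-back succeeds."""
    # Assumption for this sketch: a refresh that produced no records is a failure.
    if not records:
        raise PersistenceError("refresh produced no records")

    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    expected_digest = hashlib.sha256(payload).hexdigest()

    # Write to a temporary file in the destination directory, then swap it in,
    # so readers never see a partially written file.
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, dest_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

    # Read the data back from the destination and compare digests before
    # reporting success.
    with open(dest_path, "rb") as fh:
        stored_digest = hashlib.sha256(fh.read()).hexdigest()
    if stored_digest != expected_digest:
        raise PersistenceError("stored data does not match what was written")

    return {"path": dest_path, "sha256": expected_digest, "record_count": len(records)}
```

With this shape, a write that silently fails surfaces as an exception instead of a success status, and the returned receipt gives downstream steps something concrete to check.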
