Notification Failures
This summary is created by Generative AI and may differ from the actual content.
Overview
On December 3, 2025, PagerDuty experienced a service disruption affecting notification delivery and related platform functions in the EU and US service regions. The incident began at 21:24 UTC and was fully resolved by 00:30 UTC on December 4. No inbound events were lost or dropped during this time.
Impact
Notification delivery in both the EU and US service regions experienced disruptions. Some users also saw intermittent errors when accessing their contact information or when using Live Call Routing. A total of 14,948 messages were undelivered in the EU, with 11,768 being high urgency and 3,180 being low urgency. In the US, 114,768 messages were undelivered, with 103,141 being high urgency and 11,627 being low urgency.
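As a quick cross-check of the figures above, the following minimal sketch confirms that the high and low urgency counts add up to the reported totals for each region and derives the high urgency share. The dictionary layout and the `high_urgency_share` helper are illustrative assumptions, not part of any PagerDuty tooling. Only the counts themselves come from this summary.

```python
# Undelivered message counts as reported in the Impact section.
# The structure and names here are hypothetical, chosen for illustration.
undelivered = {
    "EU": {"total": 14_948, "high": 11_768, "low": 3_180},
    "US": {"total": 114_768, "high": 103_141, "low": 11_627},
}


def high_urgency_share(region: str) -> float:
    """Return the fraction of undelivered messages that were high urgency."""
    counts = undelivered[region]
    return counts["high"] / counts["total"]


for region, counts in undelivered.items():
    # The per-urgency counts should sum to the reported regional total.
    assert counts["high"] + counts["low"] == counts["total"]
    share = high_urgency_share(region)
    print(f"{region}: {counts['total']:,} undelivered, {share:.1%} high urgency")

# Combined total across both service regions.
combined_total = sum(c["total"] for c in undelivered.values())
print(f"Combined: {combined_total:,} undelivered")
```

The reported breakdowns are internally consistent, and the two regions together account for 129,716 undelivered messages.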
Trigger
A routine deployment was made to a core service within the message delivery system, which introduced a minor change in how certain types of internal network traffic were routed between services. This update caused an inconsistency in the configuration of an upstream service, affecting how certain request headers were processed.
Detection
Automated monitors promptly detected the change in delivery behavior, and the on-call team was engaged within two minutes.
Resolution
During the rollback, a configuration template associated with the update was applied more broadly than intended, which introduced the same routing condition to the US service region. After identifying the contributing configuration, the team applied a targeted correction, and services in both service regions returned to normal operation.
Root Cause
The incident was caused by the combination of the routine deployment and the resulting inconsistency in the configuration of the upstream service. The root cause lay not in application code but in an interaction within the deployment system.
