Resolved: Issues with EU Service Region

Severity: Critical
Category: Hardware
Service: PagerDuty

This summary is created by Generative AI and may differ from the actual content.

Overview

Between May 13, 2026 at 23:47 UTC and May 14, 2026 at 00:30 UTC, PagerDuty customers in the EU service region experienced a 43‑minute disruption affecting the web application, REST API, notifications, and mobile application. The outage was caused by hardware failures that made the primary database writer for the EU region unreachable. During the incident, customers encountered login failures, application errors, API failures, and delayed notifications, though no events were lost. Service was restored after engineers promoted a standby database to primary and updated the routing configuration.
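The 43-minute figure follows directly from the two timestamps above, which span a UTC date boundary. A minimal check of that arithmetic (the variable names are illustrative only):

```python
from datetime import datetime, timezone

# Start and end of the disruption, as reported in the overview (UTC).
start = datetime(2026, 5, 13, 23, 47, tzinfo=timezone.utc)
end = datetime(2026, 5, 14, 0, 30, tzinfo=timezone.utc)

# Subtracting two aware datetimes yields a timedelta that handles the date rollover.
duration_minutes = int((end - start).total_seconds() // 60)
print(duration_minutes)
```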

Impact

The disruption lasted 43 minutes and impacted all EU‑region services, including web and mobile access, the REST API, and event/notification delivery. Customers experienced login failures and error responses. Notifications were delayed but eventually delivered. No data loss occurred, and US‑region customers were unaffected.

Trigger

Hardware failures on the underlying system caused the primary database writer serving the EU service region to become unreachable.

Detection

The incident was identified when customers reported login failures and errors across the web and API services, leading the engineering team to discover that the primary database writer was unreachable.

Resolution

Engineers promoted a dedicated standby database to become the new primary instance, then updated internal routing to direct services to the promoted database. Normal operation resumed shortly thereafter, with full recovery confirmed at 00:30 UTC.
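The two recovery steps described above (promote the standby, then repoint routing) can be illustrated with a small model. This is only a sketch under assumptions: the class names, the `fail_over` function, the node names, and the in-memory routing table are all hypothetical and do not reflect PagerDuty's actual tooling or infrastructure.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseNode:
    """Hypothetical database instance; not a real PagerDuty component."""
    name: str
    role: str  # "primary", "standby", or "failed"
    reachable: bool = True


@dataclass
class Router:
    """Hypothetical routing table mapping a region to its writer node."""
    writers: dict = field(default_factory=dict)

    def writer_for(self, region: str) -> DatabaseNode:
        return self.writers[region]


def fail_over(router: Router, region: str, standby: DatabaseNode) -> DatabaseNode:
    """Promote a standby and repoint the region's writer route to it."""
    current = router.writer_for(region)
    if current.reachable:
        raise RuntimeError("primary is still reachable; refusing to fail over")
    if standby.role != "standby" or not standby.reachable:
        raise RuntimeError("standby is not eligible for promotion")

    # Step 1: promote the standby to primary.
    standby.role = "primary"
    current.role = "failed"

    # Step 2: update routing so services connect to the promoted node.
    router.writers[region] = standby
    return standby


# Illustrative scenario: the EU writer becomes unreachable.
old_primary = DatabaseNode("eu-writer-1", "primary", reachable=False)
replica = DatabaseNode("eu-standby-1", "standby")
router = Router({"eu": old_primary})

new_primary = fail_over(router, "eu", replica)
print(router.writer_for("eu").name, new_primary.role)
```

In this sketch the routing change happens only after promotion, mirroring the order given above, and the function refuses to act while the old primary is still reachable, which is one simple guard against ending up with two writers.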

Root Cause

A hardware failure on the primary database writer’s underlying infrastructure caused the writer to lose connectivity, resulting in the service outage for the EU region.