Investigating Potential Issue

Severity: Major
Category: Scalability
Service: PagerDuty

This summary is created by Generative AI and may differ from the actual content.

Overview

On October 9, 2022, the event-dispatching endpoint for inbound integrations in one of PagerDuty's US regions failed, disrupting the processing of incoming event data.

Impact

The X-ERE Global Events in the US region were non-functional from 16:12 to 16:46 GMT (34 minutes). During that window, incoming event data could not be processed, and failed events received HTTP 500 responses.
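
For senders that saw 500 responses during the window, retrying with exponential backoff is one common way to ride out a short outage. The sketch below is illustrative only and does not describe PagerDuty's client behavior or API: `send_with_retry`, its parameters, and the `send` callable (assumed to return an HTTP status code) are hypothetical names introduced for this example.

```python
import time
from typing import Callable


def send_with_retry(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-5xx status or attempts run out.

    `send` is a hypothetical callable that submits one event and returns
    the HTTP status code of the response.
    """
    status = 0
    for attempt in range(max_attempts):
        status = send()
        if status < 500:
            # Success or a client-side error: retrying will not help.
            return status
        if attempt < max_attempts - 1:
            # Server error: wait base_delay, 2x, 4x, ... before retrying.
            sleep(base_delay * 2 ** attempt)
    # Every attempt failed with a 5xx; return the last status seen.
    return status
```

With the default settings, the waits total 15 seconds, far less than the 34-minute window above, so a sender in practice would also need to queue undelivered events or use a much longer retry schedule.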

Trigger

A spike in resource usage in one of the data pipeline components forced it to stop processing part of the incoming event data.

Detection

The issue was detected through reports of a potential problem, which prompted an investigation.

Resolution

The service was restarted manually, restoring functionality by 16:46 GMT.

Root Cause

The automated recovery process failed to trigger. The reason for this failure is under investigation to prevent a recurrence.