Incident with Issues, Git Operations and API Requests

Severity: Major
Category: Network
Service: GitHub

This summary is created by Generative AI and may differ from the actual content.

Overview

On March 3, 2025, between 04:07 UTC and 09:36 UTC, various GitHub services were degraded, with an average error rate of 0.03% and a peak error rate of 9%. The issue affected web requests, API requests, and Git operations. Degraded performance was first reported at 04:20 UTC, and intermittent timeouts were identified at 04:23 UTC. Status updates during the incident noted that Webhooks and Git Operations were operating normally. In response to this incident, we are improving our monitoring capabilities to identify and respond to similar silent errors more effectively in the future.

Impact

Various GitHub services were degraded, with an average error rate of 0.03% and a peak error rate of 9%. The issue affected web requests, API requests, and Git operations.

Trigger

A network node in one of GitHub's datacenter sites partially failed, resulting in silent packet drops for traffic served by that site.

Detection

At 09:22 UTC, we identified the failing network node.

Resolution

At 09:36 UTC, we addressed the issue by removing the faulty network node from production.
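
The timestamps reported in this summary imply a few durations that are not stated explicitly. The following is a minimal Python sketch of that arithmetic, not part of the original report. It assumes every timestamp falls on 2025-03-03 UTC and takes 04:07 as the start of impact. The variable names and the `utc` and `minutes` helpers are illustrative choices.

```python
from datetime import datetime, timedelta, timezone


def utc(hhmm: str) -> datetime:
    """Parse an HH:MM string as a time on 2025-03-03 UTC (assumed incident date)."""
    t = datetime.strptime(hhmm, "%H:%M")
    return datetime(2025, 3, 3, t.hour, t.minute, tzinfo=timezone.utc)


def minutes(delta: timedelta) -> int:
    """Convert a timedelta to whole minutes."""
    return int(delta.total_seconds() // 60)


# Timestamps taken from the summary.
impact_start = utc("04:07")     # start of degraded service
first_report = utc("04:20")     # initial reports of degraded performance
node_identified = utc("09:22")  # failing network node identified
node_removed = utc("09:36")     # faulty node removed from production

durations = {
    "impact_to_first_report": minutes(first_report - impact_start),
    "impact_to_identification": minutes(node_identified - impact_start),
    "identification_to_removal": minutes(node_removed - node_identified),
    "total_impact": minutes(node_removed - impact_start),
}

for name, mins in durations.items():
    print(f"{name}: {mins // 60}h {mins % 60:02d}m")
```

Under these assumptions, the failing node was identified about 5 hours 15 minutes after impact began and was removed from production 14 minutes later, for a total impact window of 5 hours 29 minutes.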

Root Cause

A network node in one of GitHub's datacenter sites partially failed, resulting in silent packet drops for traffic served by that site.