Incident with GIT LFS and Other Requests

Severity: Major
Category: Scalability
Service: GitHub

This summary is created by Generative AI and may differ from the actual content.

Overview

On February 6, 2025, between 8:40AM UTC and 11:13AM UTC the GitHub REST API was degraded following the rollout of a new feature.

Impact

Error rate peaked at 100% of requests to the service. Customers may intermittently experience failures to fetch repositories with LFS, as well as increased latency and errors across the API.

Trigger

Rollout of a new feature resulted in an increase in requests that saturated a cache and led to cascading failures in unrelated services.

Detection

Observed load spikes and degraded performance for API Requests.

Resolution

Increased the allocated memory to the cache and rolled back the feature that led to the cache saturation. Scaled out database resources and rolled back recent changes.

Root Cause

New feature resulted in an increase in requests that saturated a cache and led to cascading failures in unrelated services.