Increased Latency and Failures for SSH Git Operations
This summary was created by generative AI and may differ from the actual content.
Overview
Between 14:00 and 16:10 UTC on 5 May 2026, SSH-based Git operations experienced elevated latency and intermittent failures (average error rate 0.46%, peak 0.6%). HTTP-based Git operations were unaffected. The cause was reduced SSH capacity at a single data-center site: during a period of high traffic, the remaining hosts became overloaded, leading to connection exhaustion and some failed SSH requests. Additional SSH capacity was provisioned and fully online by 18:18 UTC, resolving the incident.
Impact
Users of SSH-based Git operations saw elevated latency and intermittent failures. HTTP-based Git operations continued normally. No further business-level impact, such as revenue loss, was reported.
Trigger
Reduced SSH capacity at one data-center site coincided with a period of high traffic. The remaining SSH hosts became overloaded, leading to connection exhaustion and the observed latency and failures.
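The arithmetic behind this kind of overload can be sketched as follows. This is an illustrative model only: the host counts, the connection limit, and the traffic figure are hypothetical and are not taken from the incident, and the model assumes connections are spread evenly across healthy hosts.

```python
import math


def per_host_load(total_connections: int, healthy_hosts: int) -> float:
    """Average concurrent connections each healthy host must carry."""
    if healthy_hosts <= 0:
        raise ValueError("at least one healthy host is required")
    return total_connections / healthy_hosts


def hosts_needed(total_connections: int, per_host_limit: int) -> int:
    """Minimum number of hosts so no host exceeds its limit (even spread assumed)."""
    if per_host_limit <= 0:
        raise ValueError("per_host_limit must be positive")
    return math.ceil(total_connections / per_host_limit)


# Hypothetical figures for illustration only.
PER_HOST_LIMIT = 1000   # assumed connection limit per SSH host
PEAK_CONNECTIONS = 9000  # assumed concurrent connections at peak traffic

print(per_host_load(PEAK_CONNECTIONS, 10))             # 900.0: under the limit
print(per_host_load(PEAK_CONNECTIONS, 8))              # 1125.0: over the limit
print(hosts_needed(PEAK_CONNECTIONS, PER_HOST_LIMIT))  # 9
```

In this hypothetical, losing two of ten hosts is harmless at normal load but pushes every remaining host past its limit at peak, which matches the pattern described above: reduced capacity alone did not cause failures until traffic rose.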
Detection
Monitoring showed degraded SSH Git performance (elevated latency and error rates), which prompted the on-call team to investigate.
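A minimal sketch of an error-rate check of the kind such monitoring might apply is shown here. The function names, the request counts, and the 0.25% threshold are assumptions made for illustration; the summary does not describe the actual monitoring rules.

```python
def error_rate_percent(failed: int, total: int) -> float:
    """Failed requests as a percentage of all requests in a window."""
    if total < 0 or failed < 0 or failed > total:
        raise ValueError("counts must satisfy 0 <= failed <= total")
    if total == 0:
        return 0.0  # no traffic in the window, so nothing failed
    return 100.0 * failed / total


def is_degraded(failed: int, total: int, threshold_percent: float = 0.25) -> bool:
    """True when the window's error rate exceeds the (hypothetical) threshold."""
    return error_rate_percent(failed, total) > threshold_percent


# Hypothetical window: 46 failures out of 10,000 SSH requests is a 0.46% error rate.
print(is_degraded(failed=46, total=10_000))  # True
print(is_degraded(failed=10, total=10_000))  # False
```

Even a sub-1% error rate like the one reported can be significant for a high-volume service, which is why a check like this would use a low threshold rather than waiting for widespread failure.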
Resolution
Additional SSH capacity was provisioned and was fully online by 18:18 UTC, restoring normal operation. Post-incident actions include implementing faster scaling for SSH infrastructure and improving alerting on host availability and capacity thresholds.
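One way alerting on host availability and capacity thresholds could look is sketched below. Everything in it is hypothetical: the `SiteStatus` fields, the threshold values, and the alert wording are assumptions for illustration, not a description of the alerting that was actually implemented.

```python
from dataclasses import dataclass


@dataclass
class SiteStatus:
    """Hypothetical snapshot of SSH capacity at one data-center site."""
    name: str
    expected_hosts: int
    healthy_hosts: int
    active_connections: int
    per_host_limit: int


def capacity_alerts(
    site: SiteStatus,
    min_available_fraction: float = 0.8,  # assumed availability threshold
    max_utilization: float = 0.75,        # assumed capacity threshold
) -> list[str]:
    """Return alert messages for low host availability or high utilization."""
    alerts = []

    # Host availability: fraction of expected hosts that are healthy.
    if site.expected_hosts > 0:
        available = site.healthy_hosts / site.expected_hosts
        if available < min_available_fraction:
            alerts.append(f"{site.name}: only {available:.0%} of SSH hosts healthy")

    # Capacity: connections in use relative to what healthy hosts can carry.
    capacity = site.healthy_hosts * site.per_host_limit
    if capacity == 0:
        alerts.append(f"{site.name}: no SSH connection capacity available")
    elif site.active_connections / capacity > max_utilization:
        used = site.active_connections / capacity
        alerts.append(f"{site.name}: SSH connection utilization at {used:.0%}")

    return alerts


# Hypothetical site that has lost 3 of 10 hosts while traffic is high.
degraded = SiteStatus("site-a", expected_hosts=10, healthy_hosts=7,
                      active_connections=6500, per_host_limit=1000)
for message in capacity_alerts(degraded):
    print(message)
```

Checking availability separately from utilization means a loss of hosts can raise an alert during quiet periods, before a traffic peak turns reduced capacity into connection exhaustion.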
Root Cause
SSH capacity at a single data-center site was reduced. Under high traffic, the remaining SSH hosts were overloaded, causing connection exhaustion and the observed latency and intermittent failures.
