Incident with CodeQL, Webhooks, Notifications, and Slack Integration
This summary is created by Generative AI and may differ from the actual content.
Overview
Incident with CodeQL, Webhooks, Notifications, and Slack Integration causing processing delays due to replication lag and insufficient worker capacity.
Impact
53% of CodeQL check runs took >15 min; notifications avg 22 min delivery; Slack webhooks avg 20 min delivery; services experienced degraded performance.
Trigger
Replication lag from an internal database migration leading to insufficient worker capacity for high job enqueue rate.
Detection
Degraded performance was observed as delays in CodeQL actions, notifications, webhooks, and Slack integration, prompting investigation and updates.
Resolution
Scaled processing workers to handle increased load; added capacity; plan to create dedicated worker pools for high‑usage queues.
Root Cause
Replication lag from database migration caused queue backlogs and worker shortage, resulting in delayed processing across services.
