Elevated errors for some clusters in US East

Incident Report for Bonsai

Resolved

This incident has been resolved. The root cause was an unexpected burst of traffic in the region. During the burst, a node in the proxy layer ran out of resources: the number of open connections reached the server's OS-level limit on open files, so the node stalled and could not open any new connections. As a result, error rates spiked for users whose requests were routed through this server. We have raised the relevant system-level limit and are adding capacity to better handle high-traffic users.
Posted May 24, 2017 - 15:42 UTC
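
For readers who want to watch for the same failure mode on their own servers, the following is a minimal sketch, not the tooling we use. It assumes a Unix-like host with Python's standard resource module; the helper names and the 0.8 alert threshold are hypothetical, and counting descriptors via /proc/self/fd is a Linux-specific assumption.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count():
    """Count this process's open descriptors (assumes Linux /proc)."""
    return len(os.listdir("/proc/self/fd"))


def fd_usage_ratio(open_fds, soft_limit):
    """Fraction of the soft limit in use; 0.0 if the limit is unlimited."""
    if soft_limit == resource.RLIM_INFINITY or soft_limit <= 0:
        return 0.0
    return open_fds / soft_limit


def is_near_limit(open_fds, soft_limit, threshold=0.8):
    """True when descriptor usage meets or exceeds the (hypothetical) threshold."""
    return fd_usage_ratio(open_fds, soft_limit) >= threshold


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
    if os.path.isdir("/proc/self/fd"):
        used = open_fd_count()
        print(f"open descriptors: {used}, near limit: {is_near_limit(used, soft)}")
```

Each open connection consumes a file descriptor, so a check like this can flag a node that is approaching its limit before it stops accepting connections.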

Monitoring

We have a fix in place and are monitoring its effect. Impacted users should start seeing performance improve.
Posted May 24, 2017 - 15:23 UTC

Identified

We have identified the underlying issue and are working to expand capacity.
Posted May 24, 2017 - 14:45 UTC

Investigating

We have received an alert about a load-related issue in the US East region and are investigating.
Posted May 24, 2017 - 14:21 UTC