This incident has been resolved. Root cause was an unexpected burst of traffic in the region. When this occurred, a node within the proxy layer failed due to lack of resources. Specifically, the number of open connections reached the server's OS-level setting for maximum open files, causing it to stall and fail to open subsequent connections. As a result, error rates spiked for users whose requests were routed through this server. We made the necessary system level setting changes and are in the process of adding capacity to further handle high-traffic users.
Posted about 1 year ago. May 24, 2017 - 15:42 UTC
We have a solution in place and are monitoring progress right now. Impacted users should start seeing performance improvements.
Posted about 1 year ago. May 24, 2017 - 15:23 UTC
We have identified the underlying issue and are working to expand capacity.
Posted about 1 year ago. May 24, 2017 - 14:45 UTC
We have received an alert about a load-related issue in the US-East region and are investigating.