Informational message: [RESOLVED] Elevated error rates
We can confirm that some customers experienced elevated error rates for requests between 2:05 PM PST and 2:20 PM PST. The issue is now resolved.
A sobering look at the reliability of the cloud.
Archive for the ‘Amazon Simple Queue Service (US)’ Category.
In our investigation we discovered an SSL configuration issue that caused increased latencies. We’ve updated the configuration and SSL latencies have returned to normal.
We can confirm that the increased latencies are limited to customers using SSL connections. Customers not using SSL connections are unaffected. We continue to work to reduce connection latencies for SSL.
We are working to reduce increased connection latencies.
We are investigating increased connection latencies.
We’re investigating increased errors.
Between 2:10 PM PDT and 4:15 PM PDT today, a small number of queues experienced elevated error rates and latencies. Our monitoring correctly alerted us to this issue and we took steps to mitigate impact to customers. The underlying issue was caused by a recent deployment to a small number of hosts. This deployment, along with an unusual traffic pattern this afternoon, triggered an edge case that resulted in the increased error rates. We have since rolled back this deployment and are in the process of updating our test procedures to take this traffic pattern into account for future deployments. The service is now operating normally.
We’ve taken a number of steps to mitigate the impact of this issue and are actively working to address the underlying cause.
We’ve confirmed that some customers are experiencing an increase in HTTP 500 Internal Server Error responses for requests to a small subset of queues.
We are investigating elevated error rates.