
AWS Confirms Data Center Outage Caused By ‘Thermal Event,’ Some Services Still Impacted
Why It Matters
The incident highlights the fragility of even the most robust cloud infrastructure, prompting enterprises to reassess multi‑region redundancy and disaster‑recovery strategies. Ongoing service degradation can affect high‑profile fintech and gaming platforms, potentially eroding trust in AWS’s reliability.
Key Takeaways
- Thermal cooling failure halted power in US‑East‑1 data center
- EC2, EBS, SageMaker, Redshift and other services impaired
- Coinbase's new AI‑payment service disrupted for seven hours
- AWS previously faced drone attacks on Middle East data centers
Pulse Analysis
The May 8 thermal event at AWS’s US‑East‑1 region underscores how physical‑layer incidents can cascade into large‑scale cloud outages. A cooling system malfunction triggered a loss of power, forcing the provider to throttle compute, storage and analytics services that power millions of workloads worldwide. While AWS restored power to a subset of the infrastructure within hours, lingering impairments in EC2, EBS, SageMaker and related services illustrate the challenges of rapid recovery in densely packed data centers where redundancy is often limited to the availability‑zone level.
High‑visibility customers felt the impact immediately. Coinbase, which recently launched an AI‑driven payment service in partnership with AWS, experienced a seven‑hour interruption that delayed transaction processing for users. The incident forced the exchange to fall back on snapshots and redeploy resources in unaffected zones, a reminder that even cutting‑edge fintech platforms must maintain robust cross‑region failover plans. FanDuel and other consumer‑facing firms reported similar disruptions, highlighting the broader risk to sectors that rely on real‑time data pipelines and low‑latency compute.
This outage follows a series of AWS disturbances this year, including drone attacks on facilities in the United Arab Emirates and Bahrain that caused power outages and water damage. Combined with the thermal event, these incidents raise questions about the resilience of the cloud market leader, which posted $37.6 billion in Q1 2026 revenue and a $150 billion annual run rate. Enterprises are likely to reevaluate multi‑cloud strategies, diversify workloads across regions, and demand greater transparency on physical‑security measures. As cloud adoption accelerates, the ability to anticipate and mitigate infrastructure‑level failures will become a decisive factor in vendor selection.