Coinbase reviews its May outage: AWS cascading failure exposes architectural risks

By: rootdata|2026/06/01 20:45:00

Coinbase has released a retrospective report on its large-scale service outage of May 7, 2026.

The outage lasted approximately 8 hours, with full recovery taking about 12 hours. During this time, trading, deposits, withdrawals, and most core services were unavailable or severely degraded. Coinbase attributed the outage to the simultaneous failure of multiple cooling units at a data center in one availability zone (use1-az4) of the AWS us-east-1 region. The failure triggered thermal-protection shutdowns of server racks, which took EC2 instances and EBS volumes offline and affected multiple internet services.

During the recovery process, Coinbase's trade-matching engine lost quorum: its cluster was deployed in a single AWS data center and lost most of its nodes. Restoring it required emergency code changes and the construction of a new node group, and markets were reopened gradually as recovery progressed.
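As a rough illustration of what losing quorum means, the sketch below assumes a simple majority rule, which is common in consensus-based clusters. The report as summarized here does not give the matching engine's cluster size or consensus protocol, so the five-node figure and the function names are hypothetical.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def has_quorum(cluster_size: int, healthy_nodes: int) -> bool:
    """True if enough nodes remain to keep accepting writes under majority rule."""
    return healthy_nodes >= quorum_size(cluster_size)


# Hypothetical five-node cluster (size is an assumption, not from the report).
print(has_quorum(cluster_size=5, healthy_nodes=3))  # True: 3 of 5 is a majority
print(has_quorum(cluster_size=5, healthy_nodes=2))  # False: most nodes are gone
```

Under this model, a cluster whose nodes all sit in one data center can lose its majority in a single facility-level failure, no matter how many nodes it has.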

In addition, the AWS-managed Kafka service (MSK) suffered control-plane failures that prevented partition leaders from being re-elected automatically. This blocked quotes, fees, and some settlement and data-flow systems, widening the overall impact.

Systems gradually returned to normal after Coinbase and the AWS engineering team migrated partitions manually. Coinbase said the incident exposed shortcomings in its automatic cross-availability-zone failover and in its disaster recovery for managed middleware. The company will upgrade its cross-region hot-standby architecture, run failure drills more regularly, and move its Kafka deployment from two availability zones to three, while working with AWS on root-cause fixes and improvements.
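One way to see why a three-zone layout is attractive is to check whether a majority of nodes survives the loss of any single zone. The sketch below uses a simplified majority model with hypothetical zone names and node counts. Real Kafka availability also depends on settings such as the replication factor and `min.insync.replicas`, which are not described in the report.

```python
def survives_any_single_az_loss(nodes_per_az: dict[str, int]) -> bool:
    """True if a strict majority of nodes remains after losing any one zone."""
    total = sum(nodes_per_az.values())
    needed = total // 2 + 1  # strict majority of the original cluster
    return all(total - count >= needed for count in nodes_per_az.values())


# Hypothetical layouts; zone names and node counts are illustrative only.
two_az = {"az-a": 2, "az-b": 1}
three_az = {"az-a": 1, "az-b": 1, "az-c": 1}

print(survives_any_single_az_loss(two_az))    # False: losing az-a leaves 1 of 3
print(survives_any_single_az_loss(three_az))  # True: any loss leaves 2 of 3
```

With only two zones, one zone must hold at least half of the nodes, so its failure always removes the majority. Spreading nodes evenly across three zones avoids that.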
