Jesse Pollak, the head of engineering for consumer products at Coinbase, is revealing what caused two outages that paralyzed the leading digital currency exchange during moments of peak trading volume and price action.
In a newly published blog post, Pollak says the first incident, on April 29th, began with an increase in the rate of new connections hitting the exchange's primary databases.
“When this spike in connections occurred, the host operating system for the database began rejecting new TCP connections to the host, which triggered degraded operations and restarts in the routing layer for the database.”
Initial attempts to mitigate the problem were unsuccessful, and users had trouble accessing Coinbase during two separate periods that day.
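For readers wondering how a burst of new connections can be kept away from a database host, one common general technique is a bounded connection pool, which reuses a small, fixed number of connections instead of opening one per request. The sketch below is illustrative only: the post does not say Coinbase uses this approach, and `BoundedPool`, `connect` and `max_size` are hypothetical names chosen for the example.

```python
import queue
import threading


class BoundedPool:
    """Reuses a capped number of connections instead of opening one per request."""

    def __init__(self, connect, max_size):
        self._connect = connect                      # hypothetical factory that opens one connection
        self._idle = queue.Queue()                   # connections waiting to be reused
        self._slots = threading.Semaphore(max_size)  # caps connections in use at once

    def acquire(self, timeout=1.0):
        # Wait for a free slot rather than opening connections without limit.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("connection pool exhausted")
        try:
            return self._idle.get_nowait()           # reuse an idle connection if one exists
        except queue.Empty:
            try:
                return self._connect()               # otherwise open a new one, within the cap
            except Exception:
                self._slots.release()                # do not leak the slot if connecting fails
                raise

    def release(self, conn):
        # Return the connection for reuse, then free the slot.
        self._idle.put(conn)
        self._slots.release()
```

With a pool like this, a traffic spike shows up as callers waiting briefly for a slot (or timing out) rather than as a flood of new TCP connections reaching the database host.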
Another outage hit the exchange during a surge in traffic on May 9th as market volatility picked up. Pollak says the outage that day was linked to increased latency across outgoing HTTP requests that ultimately triggered a throttling of DNS queries.
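To illustrate why DNS query volume matters here, the sketch below shows a generic client-side cache that answers repeated lookups locally, so far fewer queries reach the resolver that might throttle them. It is an assumption-laden example, not the remediation described in the post: `CachingResolver` is a hypothetical class, and the fixed `ttl_seconds` value stands in for the per-record TTLs a real resolver would honor.

```python
import socket
import time


class CachingResolver:
    """Caches hostname lookups for a fixed time to cut the number of DNS queries."""

    def __init__(self, ttl_seconds=30.0, lookup=socket.gethostbyname, clock=time.monotonic):
        self._ttl = ttl_seconds   # assumed fixed lifetime; real DNS records carry their own TTL
        self._lookup = lookup     # function that performs the actual DNS query
        self._clock = clock       # injectable clock, which makes the cache easy to test
        self._cache = {}          # hostname -> (address, expiry time)

    def resolve(self, hostname):
        now = self._clock()
        hit = self._cache.get(hostname)
        if hit is not None and hit[1] > now:
            return hit[0]         # still fresh: no DNS query is sent
        address = self._lookup(hostname)
        self._cache[hostname] = (address, now + self._ttl)
        return address
```

A service making thousands of outgoing HTTP requests per second to the same few hostnames would, with a cache of this kind, send roughly one query per hostname per TTL window instead of one per request.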
Pollak says Coinbase is rolling out improvements to prevent similar incidents in the future.
“Both of these incidents impacted our ability to serve Coinbase customers at critical moments. One of our company values is continuous learning and we are committed to taking the learnings from these episodes to improve Coinbase.”