jamwt, 2w ago

Partial Outage of Convex Deployments

status: one database cluster had a primary failure and is recovering right now. it's affecting a small (but nonzero) number of customers (probably you, @vitorwindberg). we'll follow up in a few minutes with an update
2 Replies
jamwt (OP), 2w ago
most customers should be recovered; we've created an incident on our status page here: https://status.convex.dev/incidents/kb6bh97l9lwl
jamwt (OP), 2w ago
This is now fully resolved. The short version is, some of our newer database clusters (post-Chef) have TONS and TONS of individual deployments on them. Upon failover, it can take a long time to recover these projects on the new primary. Over the next few months, we're going to invest in changing our database provisioning and management systems to be ready for an order of magnitude or more projects on Convex in the future, and to be able to operate through failures with very fast recovery times. (There are now hundreds of thousands of deployments running in Convex's cloud.)