Love this thing where Google Cloud decides that a VM has been stable for a while and it really ought to do something about that, so it kills the VM and spins up a new one to replace it only *after* it's dead, resulting in ~10 minutes of spurious downtime.

It's been doing this for ~two years, COME ON Google, y'all are supposed to be experts at rollouts. Start new nodes *before* you kill existing ones!

@jepsen Might be a bit overkill, but have you considered hosting your site in kubernetes? In gke, rollouts seem to be managed a bit better, where the default (but configurable) behavior is that extra nodes do get spun up in case of an upgrade, then your pod gets migrated, and only then does the old node get shut down. Or if it's just a static site, why not use something like cloudflare workers sites, i've found that to be extremely convenient to use.


@rior I'm... hoping not to have to redo my entire deployment scheme every 2 years but this appears to be the hellscape we're headed into

mostly it's like

I KNOW AppEngine Flex's problems. What I *don't* know is how GKE is going to ruin my life.


@jepsen Hmm, gke is probably not the right choice for you then, because serving static files in kubernetes is a bit annoying. But I had my cloudflare workers site up in like an hour. (i was already using cloudflare as my dns, which makes it easier). Their wrangler tool + their quickstart makes it quite easy.

@jepsen Well, then you could put your backend in gke, which is pretty easy to use and set up as far as kubernetes goes, but idk if it's worth it to save 10 minutes of downtime every now and then 🤷

A single-user Mastodon instance for Jepsen announcements & discussion.