Originally Posted by
Galkon
A good practice is for the client to do as little as possible. If you depend on the client correctly pinging servers and choosing which one to connect to, you can't change any of that logic until you ship a client update and everyone actually installs it. Maybe that's fine for a website, but if you put the servers behind a load balancer instead, the balancer can be reconfigured regardless of client version, which gives you far more control.
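As a sketch of that idea, here's a minimal (hypothetical) nginx upstream config; the addresses and ports are made up. The point is that the server pool can be edited and reloaded entirely server-side while clients only ever know the balancer's address:

```nginx
upstream api_backend {
    # Add, remove, or reweight backends here and reload nginx;
    # no client ever needs to know this pool exists.
    server 10.0.0.11:8080 weight=3;
    server 10.0.0.12:8080;
    server 10.0.0.13:8080 backup;  # only used if the others are down
}

server {
    listen 80;
    location / {
        proxy_pass http://api_backend;
    }
}
```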
If your main DB goes down (let's say it's where you execute writes) and you have 2-3 read replicas, your site would still function normally in read-only mode. Only when writes needed to go through would users experience issues.
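The read/write split is just routing. Here's a rough sketch in Python of what I mean; the `DbRouter` and `FakeConn` names are mine, and the connections are stand-ins rather than a real DB driver, but the logic is the same: writes go to the primary, reads spread across replicas, and a primary outage only breaks writes.

```python
import random


class ReadOnlyModeError(Exception):
    """Raised when a write is attempted while the primary is down."""


class FakeConn:
    """Stand-in for a real database connection (illustration only)."""

    def __init__(self, name):
        self.name = name

    def run(self, query):
        return f"{self.name}: {query}"


class DbRouter:
    """Routes writes to the primary and reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self.primary_up = True

    def execute_read(self, query):
        # Reads keep working off the replicas even if the primary is down.
        replica = random.choice(self.replicas)
        return replica.run(query)

    def execute_write(self, query):
        if not self.primary_up:
            # Degrade gracefully: the site stays usable read-only.
            raise ReadOnlyModeError("primary down; writes unavailable")
        return self.primary.run(query)


router = DbRouter(FakeConn("primary"),
                  [FakeConn("replica-1"), FakeConn("replica-2")])
print(router.execute_read("SELECT 1"))   # served by some replica
router.primary_up = False                # simulate a primary outage
print(router.execute_read("SELECT 1"))   # reads still work
```

In a real setup the routing usually lives in a connection pooler or proxy rather than application code, but the failure behavior is the same either way.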
While I understand optimizing to keep serving users during downtime, I'm curious why you're optimizing so heavily for outages, for spinning up new nodes, and for databases regularly failing. It's important to know how to mitigate downtime, but it's not what you should be designing around imo; you've described a fairly simple service (clients connecting to REST APIs).