Technology changes quickly these days, and every once in a while it’s worth stopping for a moment to reconsider current practices given the options that are now available.
One of those alternatives I have been thinking about from time to time is the conventional wisdom of running multiple instances of a resource for high availability. When I first learned of such things, on-demand resources weren’t readily available, and spinning up a new virtual machine took minutes, which would often be an unacceptable amount of downtime for a system. Now we’re living in a time when it’s possible to spin up containers in seconds, or to use event-driven resources like AWS Lambda or Google’s Cloud Functions, and that changes the options we have for designing our systems.
On a recent project, we were designing a new system and implementing a multi-VM solution with redundant resources behind load balancers, clusters of datastores, and so on. This isn’t unusual at all, but the system we were working on wasn’t a client-facing solution that ends up as a web page anyone clicks on, so I raised the question of why we needed multiples of our resources up and running all of the time. One instance was more than capable of handling the load, so horizontal scaling really wasn’t necessary. We had auto-scaling features available and could spin up new instances pretty quickly, so I proposed simply running one instance of many of our resources. This was met with heavy hesitation, and unfortunately we didn’t end up trying it; the team had never run a system like that and didn’t want to risk anything, although it did prompt them to think about the idea.
It’s an interesting concept that in many cases you could simply run one instance of a resource and, if it ever has trouble, swap it out for another instance. Good monitoring and supervision become essential for this, however. It’s sort of what you’d get if you designed a system to run entirely on Google’s preemptible VMs, assuming replacements were always available. If you can design your system to spin up a new instance of a resource on demand with very low latency, you might question whether you need multiple instances running all of the time, assuming they aren’t actually needed for other reasons.
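The supervision loop behind that idea can be sketched in a few lines. This is a minimal illustration, not a real implementation: the health check and provisioning hooks are hypothetical stand-ins for whatever your cloud provider’s API actually offers.

```python
import time

# Hypothetical hooks; in a real system these would call your provider's API
# (all names here are assumptions made up for illustration).
def is_healthy(instance):
    return instance.get("healthy", False)

def spin_up_replacement():
    # Stand-in for an on-demand provisioning call, e.g. starting a container.
    return {"id": "replacement", "healthy": True}

def supervise(instance, max_checks=3):
    """Run a single instance; if a health check fails, swap in a fresh one."""
    for _ in range(max_checks):
        if not is_healthy(instance):
            instance = spin_up_replacement()
        time.sleep(0)  # placeholder for the real polling interval
    return instance

# A failed instance is replaced rather than run redundantly alongside a spare.
result = supervise({"id": "primary", "healthy": False})
```

The design choice is that recovery time is bounded by how fast `spin_up_replacement` runs, which is exactly why this only works when on-demand provisioning is fast.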
This sparked my thinking on a metric I think many companies should track these days: minimum time to deploy. By that I mean the minimum time it would take to make a trivial change in the code and have it make its way through whatever processes are in place (code review, build, deployment, etc.) and end up in production. I see this as a useful metric to keep around because it’s essentially the minimum amount of time in which you could fix any issue in your system, assuming you’re not patching running resources in place or doing other “quick fix” kinds of things.
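In its simplest form, the metric is just the sum of the fastest path through each stage of your pipeline. The stage names and timings below are illustrative assumptions, not measured data:

```python
# Hypothetical per-stage timings (in minutes) for a trivial change to reach
# production; substitute your own pipeline's stages and measured values.
pipeline_stages = {
    "code_review": 15,
    "build": 5,
    "tests": 10,
    "deployment": 4,
}

def minimum_time_to_deploy(stages):
    """Minimum wall-clock time for a trivial fix to reach production."""
    return sum(stages.values())

print(minimum_time_to_deploy(pipeline_stages))  # → 34
```

Even a back-of-the-envelope number like this is useful, because it puts a floor under how quickly you could recover from any issue via a normal deploy.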
So perhaps take a moment to think about what else might be possible if you take advantage of modern conveniences, and how that might change parts of your system to reduce complexity and/or cost. If you’re still doing things the same way you have been for a few years, it’s probably time to look into all of the new options that are around now.
Also published on Medium.