What happens when the load balancer fails?
Before diving in, let's look at what a typical system architecture looks like.
It generally has the following components:
Clients: Responsible for sending requests to the server.
Server: The actual machine on which the processing happens. There can be single or multiple machines to handle the load as required.
Load Balancer: Distributes the requests among the many servers using strategies such as current server usage, consistent hashing, etc.
API Gateway: Validates the incoming requests and routes them to a specific server cluster.
Whenever a client sends a request over the internet, the request travels all the way to the API gateway. The API gateway validates the authenticity of the request and routes it to a server cluster. This server cluster has a load balancer at its gate, and all the servers in the cluster are connected to this load balancer. The load balancer then routes the request internally to one of the many servers.
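The flow described above can be sketched in a few lines of Python. This is only an illustration: the class names, the API key check, and the path-prefix routing are assumptions made for the example, not how any particular gateway or load balancer is implemented.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Request:
    client_id: str
    api_key: str
    path: str


class Server:
    def __init__(self, name):
        self.name = name

    def handle(self, request):
        return f"{self.name} handled {request.path}"


class LoadBalancer:
    """Sits at the gate of a server cluster and picks a server per request."""

    def __init__(self, servers):
        # Simple rotation, used here only to keep the sketch short.
        self._servers = cycle(servers)

    def route(self, request):
        return next(self._servers).handle(request)


class ApiGateway:
    """Validates a request, then forwards it to the matching cluster."""

    def __init__(self, valid_keys, clusters):
        self.valid_keys = valid_keys
        self.clusters = clusters  # hypothetical mapping: path prefix -> LoadBalancer

    def handle(self, request):
        if request.api_key not in self.valid_keys:
            return "401 Unauthorized"
        for prefix, balancer in self.clusters.items():
            if request.path.startswith(prefix):
                return balancer.route(request)
        return "404 Not Found"


orders_lb = LoadBalancer([Server("orders-1"), Server("orders-2")])
gateway = ApiGateway(valid_keys={"secret"}, clusters={"/orders": orders_lb})

print(gateway.handle(Request("c1", "secret", "/orders/42")))
print(gateway.handle(Request("c1", "wrong", "/orders/42")))
```

Notice that every valid request passes through the single `orders_lb` object, which is exactly why the load balancer deserves attention as a potential point of failure.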
How does a load balancer work?
Load balancing is a technique used to distribute workloads evenly across a number of computing resources. Load balancers provide a single access point for clients to interact with the system and distribute the workload so that no single server is overloaded.
To know more about load balancers and how they work, check out this blog: https://www.thegeekyminds.com/post/how-do-load-balancers-work-what-is-consistent-hashing-system-design
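As a rough illustration of "distributing the workload evenly", here is a minimal least-connections sketch: each new request goes to the server that currently has the fewest active requests. Least connections is just one possible strategy, chosen here for brevity, and the class and server names are made up for the example.

```python
class LeastConnectionsBalancer:
    """Tracks active requests per server and picks the least busy one."""

    def __init__(self, server_names):
        self.active = {name: 0 for name in server_names}

    def acquire(self):
        # Pick the server with the fewest active requests (ties go to the
        # server that was registered first).
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        # Call this when the server finishes handling a request.
        self.active[name] -= 1


balancer = LeastConnectionsBalancer(["s1", "s2", "s3"])
assigned = [balancer.acquire() for _ in range(6)]
print(assigned)
print(balancer.active)
```

With six requests and three servers, each server ends up with two active requests, so none of them carries more than its share.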
There are two types of load balancers: hardware and software.
Hardware load balancers run on dedicated hardware devices and are more expensive than software load balancers, but they typically offer better performance and reliability.
Software load balancers can be installed on ordinary servers or virtual machines, but like any software they can have security vulnerabilities that need to be patched periodically.
From the above arrangement, it is clear that the load balancer is the single interaction point for a client. Hence it may seem that the load balancer is a single point of failure: if the load balancer fails, how will requests get routed to the servers? This thought process is correct to some extent. The load balancer can act as a single point of failure, and in the above arrangement, if the load balancer fails, the entire application becomes unreachable.
Having said that, it is possible to avoid such a collapse in case a load balancer fails. The idea is to have a redundant load balancer in place.
Redundancy is often used in system design as a safeguard against component failure. The idea is to keep a standby load balancer that becomes active if the original load balancer fails. Having such a load balancer cluster increases the availability of the overall system and reduces the chances of downtime. The system then goes down only if all the load balancers in the cluster fail at the same time.
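The active/standby idea can be sketched as follows. This is a simplified, in-process illustration: the `LoadBalancerNode` and `FailoverPair` names are hypothetical, and a failure is simulated with a flag. Real deployments usually detect failure with heartbeats or health checks and move a shared virtual IP to the standby, which this sketch does not model.

```python
class LoadBalancerNode:
    def __init__(self, name):
        self.name = name
        self.healthy = True  # flipped to False to simulate a crash

    def route(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} routed {request}"


class FailoverPair:
    """Sends traffic to the active node and promotes the standby on failure."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby

    def route(self, request):
        try:
            return self.active.route(request)
        except ConnectionError:
            # Promote the standby; the failed node becomes the new standby.
            self.active, self.standby = self.standby, self.active
            # If the promoted node is also down, the error propagates:
            # that is the "all load balancers failed" case.
            return self.active.route(request)


primary = LoadBalancerNode("lb-primary")
backup = LoadBalancerNode("lb-backup")
pair = FailoverPair(primary, backup)

print(pair.route("request-1"))   # served by the primary
primary.healthy = False          # simulate the primary failing
print(pair.route("request-2"))   # served by the promoted standby
```

Clients only ever talk to `pair`, so the switch from the primary to the standby is invisible to them as long as at least one node is healthy.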
Now, if you are using a managed service like AWS, you may be wondering: "What happens if the AWS load balancer fails?"
What happens if the AWS load balancer fails?
Since the AWS load balancer is a managed service, there is not much you can do about a failure inside it. You can either wait for AWS to restore its load balancer service or plan for the failure in advance: if your load balancer is spread across multiple Availability Zones (or you run load balancers in more than one region), the chance of all of them failing at once drops considerably.
Alternatively, if the load balancer is down, you can temporarily modify the API Gateway to route the traffic directly to the individual servers or machines.
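A minimal sketch of this temporary bypass is shown below. It assumes the gateway keeps its own list of server endpoints and can tell that the load balancer is down (modelled here as a `ConnectionError`). The `Gateway` class and helper functions are hypothetical and only illustrate the idea.

```python
import random


class Gateway:
    def __init__(self, balancer, servers):
        self.balancer = balancer  # callable: request -> response
        self.servers = servers    # callables used only as a fallback

    def handle(self, request):
        try:
            return self.balancer(request)
        except ConnectionError:
            # Temporary bypass: pick a server directly until the
            # load balancer is back.
            return random.choice(self.servers)(request)


def broken_balancer(request):
    raise ConnectionError("load balancer is down")


def make_server(name):
    return lambda request: f"{name} handled {request}"


fallback_gateway = Gateway(broken_balancer, [make_server("s1"), make_server("s2")])
print(fallback_gateway.handle("GET /health"))
```

Picking a server at random is a crude stand-in for real load balancing, so this kind of bypass is best treated as a stopgap rather than a permanent setup.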
In short, Amazon Elastic Load Balancers are resilient to failure only if they are set up across multiple Availability Zones.