In general, I want to understand: in a distributed application, is the load balancer a single point of failure?
I am not sure, but this can be an Apache load balancer or, on top of that, a hardware load balancer device from a vendor such as F5 Networks.
I have seen (in papers/slides) that designs can have multiple Apache load balancers for the same application.
I had a discussion with my colleague about mapping multiple IP addresses/VMs/Unix boxes (each with a hardware load balancer device) to the same DNS domain (like www.amazon.com). But then who decides, and on what basis/algorithm, which request goes to which particular IP/Unix box mapped to that DNS name?
My question: at the start of the request flow (the first entry point) there is only one machine, which sends requests to the load balancers underneath it on the basis of some algorithm. If this machine fails, the whole distributed system (with its multiple load balancers, clusters, etc.) will go down.
Sorry if I am blowing this out of proportion.
Going by the definition of a single point of failure (SPOF): if your LB fails, your application becomes unavailable. So in short, yes, a single LB or reverse proxy is a SPOF.
Why is that? Even assuming you have only one LB and it can easily handle all the traffic you might have, you still need to be safe from hardware failure or any other kind of failure that might take the device down (in the extreme case, a data center collapse).
How to handle the problem?
I'll just mention here that adding layers in front of your application servers doesn't necessarily solve all your problems. Instead, you are adding network hops, which result in a time overhead, even if minor, on every request. It also sometimes makes troubleshooting harder, increases costs, and brings all the other bad things that come with a complex infrastructure. That's why I would need a very good reason for having different LBs in line.
To the point: an architecture that I would follow (similar to what you've seen in papers, as you said) is two LBs in front of your infrastructure (more than two only if they have difficulty handling your traffic), with DNS load balancing between them.
Of course this solution has drawbacks: DNS is agnostic about the state of your backend, so on its own it gives you no failover functionality.
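To illustrate that drawback, here is a minimal sketch in Python. It is a simplified model, not how any particular DNS server is implemented: it assumes a plain rotation over two hypothetical LB addresses (`192.0.2.10` and `192.0.2.20` are documentation addresses I made up) and counts how many clients are still handed the address of an LB that is down.

```python
from itertools import cycle

# Hypothetical addresses of two load balancers published under one DNS name.
LB_ADDRESSES = ["192.0.2.10", "192.0.2.20"]


def round_robin_answers(addresses, n_queries):
    """Return the address a simple round-robin rotation hands out per query."""
    rotation = cycle(addresses)
    return [next(rotation) for _ in range(n_queries)]


def failed_requests(answers, down):
    """Count the queries that were answered with an address that is down."""
    return sum(1 for addr in answers if addr in down)


if __name__ == "__main__":
    answers = round_robin_answers(LB_ADDRESSES, 6)
    # The rotation knows nothing about health, so the dead LB is still handed out.
    dead = failed_requests(answers, down={"192.0.2.20"})
    print(dead, "of", len(answers), "clients were sent to the dead LB")
```

With two LBs and one of them down, roughly half of the new lookups in this model land on the dead machine until something removes its record.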
You can address that by using a robust monitoring system in cooperation with your DNS to change the DNS records automatically, which gives you failover functionality. Again, you have to accept that DNS is bound by Time To Live (TTL), so some clients will have the "wrong" IP cached at the time of a failure.
As you can see, the above is not perfect, but most of the time it is probably your only way around the problem.
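The monitoring side of this can be sketched roughly as follows. This is only an illustration under assumptions of mine: the health check is a bare TCP connect, the function names are hypothetical, and the step that actually pushes the records to your DNS provider is left out because it depends entirely on the provider's API.

```python
import socket


def tcp_check(ip, port=80, timeout=2.0):
    """Return True if a TCP connection to ip:port succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def records_to_publish(lb_ips, check=tcp_check):
    """Pick which LB addresses the DNS name should point at right now."""
    healthy = [ip for ip in lb_ips if check(ip)]
    # If every check fails, the monitor itself may be the problem:
    # keep publishing everything rather than an empty record set.
    return healthy or list(lb_ips)


def worst_case_stale_seconds(check_interval, ttl):
    """Rough upper bound on how long a client may keep using a dead address.

    Assumes the failure is noticed at the next check and the record is
    updated immediately; ignores provider propagation delay and resolvers
    that do not honor the TTL.
    """
    return check_interval + ttl
```

For example, with a check every 30 seconds and a TTL of 60 seconds, `worst_case_stale_seconds(30, 60)` gives 90, so under these assumptions some clients could keep hitting the failed LB for about a minute and a half. Lowering the TTL narrows that window at the price of more DNS queries.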
For situations where there is even less tolerance for downtime (even for a subset of clients), I will leave a couple of alternatives.
Global Server Load Balancer (GSLB): this is a service, and as such you buy it. It does the hard job and is always there to route the traffic as you wish, either in an Active-Passive architecture (say, Primary-Disaster) or Active-Active (for instance, one data center in the USA and another in Asia). Apart from costing quite a lot, this solution sounds easy to implement, but keep in mind all the things you have to consider for it to work properly. I won't go deep into the technical details; I'll just mention that you will need double the hardware, configured to work independently in each data center yet kept in complete sync where needed.
Border Gateway Protocol (BGP): you will have to implement this with your ISPs. The implementation might be quite complex, and it has to be customized to be optimized for your needs. Again, as before, you have all the headaches of a double infrastructure. But if you have come down to this solution, you will most probably be up and running in more than one place.
Having said all of the above, a single powerful LB hosted in the cloud will be enough for the majority of web apps/sites.
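The difference between the two modes can be sketched as a routing decision. This is a toy model, not the behavior of any specific GSLB product: the `route` function, the site dictionaries, and the region names are all hypothetical, and a real service would also weigh latency, load, and its own health probes.

```python
def route(sites, mode, client_region=None):
    """Pick a site name for a client.

    sites: list of dicts with 'name', 'region' and 'healthy' keys,
           ordered by priority (primary first).
    mode:  'active-passive' or 'active-active'.
    """
    healthy = [s for s in sites if s["healthy"]]
    if not healthy:
        return None  # nothing left to route to
    if mode == "active-passive":
        # Always the highest-priority healthy site: the primary,
        # or the disaster site once the primary is down.
        return healthy[0]["name"]
    if mode == "active-active":
        # Prefer a healthy site in the client's region, else fall back.
        for site in healthy:
            if site["region"] == client_region:
                return site["name"]
        return healthy[0]["name"]
    raise ValueError("unknown mode: " + mode)


if __name__ == "__main__":
    sites = [
        {"name": "dc-usa", "region": "us", "healthy": True},
        {"name": "dc-asia", "region": "asia", "healthy": True},
    ]
    print(route(sites, "active-active", client_region="asia"))
    print(route(sites, "active-passive"))
```

In Active-Passive the second data center sits idle until the first one fails, while in Active-Active both serve traffic all the time, which is why the data they share has to be kept in sync.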