I'm hoping there are some Docker Swarm experts out there who have configured a load balancer to front a multi-node Docker Swarm setup. In such a simplified architecture, if the load balancer needs to detect that a manager node is down and stop routing traffic to it, what is the "best practice" for that? Does Docker Swarm provide a health endpoint (API) that can be tested for each manager node? I'm new to some of this and there doesn't seem to be a lot out there that describes what I'm looking for. Thanks in advance.
There is the engine's metrics endpoint, and then the Engine API, but I don't think either of those is what you want for an application load balancer.
What I see most people do is put a load balancer in front of the Swarm nodes they want to handle incoming traffic for specific apps running in services. That LB needs to know whether the containers are responding (not just whether the node's engine is healthy), so it should hit the app's health endpoint and take nodes in and out of that app's LB pool based on the app's response.
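For illustration only, here is a rough sketch of what an app-level health endpoint could look like, using just the Python standard library. The `/healthz` path, the `check_dependencies` helper, and the port are all my own placeholders, not anything Swarm or your LB requires; use whatever your app and load balancer agree on.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def check_dependencies():
    """Hypothetical readiness check; swap in real checks (database, cache, ...)."""
    return True


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the (assumed) health path is served in this sketch.
        if self.path != "/healthz":
            self.send_error(404)
            return
        healthy = check_dependencies()
        body = b"ok" if healthy else b"unhealthy"
        # 200 tells the LB to keep the node in the pool; 503 tells it to drop it.
        self.send_response(200 if healthy else 503)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging so frequent probes do not flood the log.
        pass


def make_server(host="0.0.0.0", port=8080):
    """Build the server; call .serve_forever() on the result to run it."""
    return HTTPServer((host, port), HealthHandler)
```

You would run it with `make_server().serve_forever()` inside the container and publish that port on the service, so the LB can probe it on each node.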
This is how AWS ELBs work out of the box, for example.
If you had a published service on port 80 in the Swarm, you would set up your ELB to point to the nodes you want to handle incoming traffic, and have its health check expect a healthy 2xx/3xx response from those nodes. It will remove a node from the pool if it returns something else or doesn't respond.
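To make that logic concrete, here is a minimal sketch of the kind of check a load balancer performs, written with the Python standard library. This is not how ELB is implemented; the function names, node addresses, timeout, and the "2xx/3xx means healthy" rule are assumptions chosen to mirror the description above.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(url, timeout=2.0):
    """Return True if a GET on url ends in a 2xx/3xx status within the timeout."""
    try:
        # Note: urlopen follows redirects, so the status seen is the final one.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # 4xx/5xx responses, refused connections, and timeouts all count as unhealthy.
        return False


def healthy_pool(nodes, port=80, path="/", probe=is_healthy):
    """Return the nodes whose published port passes the probe, in input order."""
    return [node for node in nodes if probe(f"http://{node}:{port}{path}")]
```

Calling something like `healthy_pool(["10.0.0.1", "10.0.0.2"], port=80, path="/")` on a timer (the addresses here are made up) gives you the set of nodes that should currently receive traffic. A real LB also requires several consecutive failures or successes before it changes a node's state, which this sketch leaves out.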
Separately, you could use a full monitoring solution that checks node health and optionally responds to issues, for example by replacing failed nodes.