docker, docker-swarm

All Docker stacks are restarting automatically


I have a multi-service environment hosted on Docker Swarm, with several stacks deployed. Every running container hosts a Spring Boot application. The problem is that all of my stacks restart on their own. I know that the compose file sets restart_policy to on-failure, which explains why the services come back up automatically. The trouble is that after the services restart, one particular service throws errors, and that breaks everything. I cannot figure out what actually triggers the restarts. After quite a lot of research, this is what I have found so far:

  1. The Docker daemon was not restarted. I double-checked this against the daemon's uptime.
  2. I ran docker service ps <Service_ID>; it shows the service going through shutdown and starting, with no other information.
  3. I checked docker service logs <Service_ID>, but there are no errors there either.
  4. I checked for a resource crunch. Plenty of resources were available, both on the host and at the level of each container.

Can someone tell me where to find logs for this? Any other thoughts?
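
For anyone repeating the checks above, here is a minimal sketch that only assembles the command lines in one place; it does not run them. The helper name diagnostic_commands and the service name my_stack_web are hypothetical, and the journalctl line assumes a systemd host where the daemon runs as docker.service, which may not match your setup.

```python
import shlex
from typing import List


def diagnostic_commands(service_id: str, since: str = "24h") -> List[List[str]]:
    """Build argv lists for commands that may help explain Swarm task restarts."""
    return [
        # Full task history; --no-trunc keeps the ERROR column untruncated.
        ["docker", "service", "ps", "--no-trunc", service_id],
        # Service logs with timestamps, limited to a recent window.
        ["docker", "service", "logs", "--timestamps", "--since", since, service_id],
        # Daemon-side log. Assumption: systemd host, unit named docker.service.
        ["journalctl", "-u", "docker.service", "--since", "yesterday", "--no-pager"],
    ]


if __name__ == "__main__":
    # "my_stack_web" is a placeholder service name.
    for argv in diagnostic_commands("my_stack_web"):
        print(shlex.join(argv))
```

The printed lines can be pasted into a shell on a manager node. Comparing the timestamps across the three outputs is usually more informative than reading any one of them alone.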

My host is a VM hosted on VMware vCenter.


Solution

  • After a lot of research and going through all the Docker logs, I could not find the cause. Later on, I discovered that a memory snapshot was being taken for backup every 24 hours. Here is what I observed:

    1. Whenever a snapshot is taken, all Docker services running on the host restart automatically. No errors are logged; the services simply restart gracefully.
    2. I found other questions that report the same problem with VMware snapshots.
    3. As far as I know, taking a snapshot makes the VM point to a different memory location while the previous one is saved. I have not been able to find out why this causes the restarts, but the snapshot was the root cause of the problem. If anyone is a VMware snapshot expert, please let us know.
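
One way to check this kind of correlation on your own host is to compare restart times with the snapshot schedule. The sketch below is illustrative only: the function name restarts_near_snapshots, the five-minute tolerance, and all timestamps are made up for the example. In practice the restart times might come from docker service ps or container start times, and the snapshot times from the backup job's schedule.

```python
from datetime import datetime, timedelta
from typing import List


def restarts_near_snapshots(
    restarts: List[datetime],
    snapshots: List[datetime],
    tolerance: timedelta = timedelta(minutes=5),
) -> List[datetime]:
    """Return the restart times that fall within `tolerance` of any snapshot time."""
    return [
        r for r in restarts
        if any(abs(r - s) <= tolerance for s in snapshots)
    ]


# Hypothetical data: a snapshot at 02:00 every 24 hours.
snapshots = [datetime(2021, 3, 1, 2, 0), datetime(2021, 3, 2, 2, 0)]

# Hypothetical restart times collected from the Swarm task history.
restarts = [
    datetime(2021, 3, 1, 2, 1, 30),
    datetime(2021, 3, 1, 14, 20),
    datetime(2021, 3, 2, 2, 2, 10),
]

matched = restarts_near_snapshots(restarts, snapshots)
print(f"{len(matched)} of {len(restarts)} restarts fall near a snapshot")
```

If most restarts land inside the tolerance window, the snapshot job is a strong suspect. Restarts outside the window would point to a second, unrelated cause.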