I have a DigitalOcean droplet where I run 4 containers, each with one small Python application.
From time to time (once every week or two), all the containers just stop working. It's not caused by the Python apps inside them.
I've made a systemd timer that executes a bash script every 30 min to check if containers are running, and if not, starts them. The timer was working for days, and it never had to restart a container.
But one day I SSHed into my droplet and saw that the containers were stopped, and
systemctl list-timers --all
showed me that the timer had disappeared from the system timers! It's just not there anymore!
The container-checking script was writing logs, and the logs stop at the same time the containers stopped.
Questions:
How do I figure out what stops my containers?
How is it possible that the systemd timer just disappeared?
How do I fix this?
I am the only one who can SSH into that droplet, so nobody else could have messed it up.
CoreOS clusters reboot themselves when new versions of the operating system become available. That means if you're starting a process on a CoreOS machine manually, at some point it might disappear.
The good news is, there is a standard way to run processes on CoreOS that will come back up when the machine does - that is, you can use systemd units. CoreOS describes what units are, and how to use them here: https://coreos.com/docs/launching-containers/launching/getting-started-with-systemd/
Briefly, you can create your own units in three steps: write a unit file, enable it, and start it.
First, put a file with a special format in /etc/systemd/system. The simplest one is probably something like
[Unit]
Description=MyApp
After=docker.service
Requires=docker.service
[Service]
ExecStart=/usr/bin/docker run mycontainer
[Install]
WantedBy=multi-user.target
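Then, you'll want to set up your system so that it will read that file (and run your container). Assuming you saved the file as /etc/systemd/system/hello.service, enable it so that it starts on every boot, and start it once by hand so that it runs right away:
$ sudo systemctl enable hello.service
$ sudo systemctl start hello.service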
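If you also want systemd to bring the container back when it exits for any other reason, a slightly more defensive variant of the unit above might look like the sketch below. The container name myapp and the 10-second restart delay are placeholders I made up for illustration; mycontainer is the image name from the example above.

```ini
[Unit]
Description=MyApp
After=docker.service
Requires=docker.service

[Service]
# Remove a container left over from a previous run. The leading "-" tells
# systemd to ignore a failure here (for example, when no such container exists).
ExecStartPre=-/usr/bin/docker rm -f myapp
# "myapp" is a placeholder container name; "mycontainer" is the image.
ExecStart=/usr/bin/docker run --name myapp mycontainer
ExecStop=/usr/bin/docker stop myapp
# Start the container again whenever the docker run process exits.
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

With Restart=always, systemd itself takes over the job of the periodic "is it still running?" check, so a separate watchdog script may no longer be needed.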
The document in the link has a lot more detail (I'd strongly recommend taking a look at it before going ahead - it's short!)
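The same reasoning probably explains the missing timer: a timer that was started with systemctl start but never enabled is not loaded again after a reboot, so it no longer shows up in systemctl list-timers --all. A minimal sketch of a timer unit that survives reboots is shown below. The file name check-containers.timer is hypothetical, and I am assuming your checking script is wrapped in a matching check-containers.service, which the timer activates by default because the names match.

```ini
# /etc/systemd/system/check-containers.timer (hypothetical file name)
[Unit]
Description=Check that the containers are running every 30 minutes

[Timer]
# First run 5 minutes after boot, then 30 minutes after each activation
# of the service this timer triggers.
OnBootSec=5min
OnUnitActiveSec=30min

[Install]
# Pulling the timer into timers.target is what makes "enable" start it at boot.
WantedBy=timers.target
```

After saving the file, sudo systemctl enable check-containers.timer followed by sudo systemctl start check-containers.timer should make it come back after every reboot.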