I have a problem with live checkpointing and restoring using CRIU and Docker.
I am currently on Ubuntu 16.04 and want to implement a feature that lets users checkpoint the state of their game application, so that a user can take a checkpoint at any time while playing. However, criu dump fails when I run it against a game written with SFML.
To be more specific, I am using a game written with SFML and CRIU 2.x on Ubuntu 16.04. After starting the game, I ran CRIU's dump command as follows:
criu dump -D img -t 2833 --shell-job --tcp-established --ext-unix-sk --external unix[33323] --external unix[33326] --external unix[33316] --external unix[33317]
It seems that CRIU's memory and file dump does not support checkpointing a process that holds certain devices open, such as the video card at /dev/dri/card0.
So I modified the CRIU source code to skip unsupported devices when they are encountered:
criu/parse_proc.c, line 721
However, an error occurred when I tried to restore with the following command:
criu restore -D img -t 2833 --shell-job --tcp-established --ext-unix-sk --inherit-fd fd[10]:socket:[33326] --inherit-fd fd[3]:socket:[33316] --inherit-fd fd[7]:socket:[33317] --inherit-fd fd[8]:pipe:[33323] --inherit-fd fd[9]:pipe:[33323]
The error message says:
Error (criu/files.c:1477): Can't fstat inherit fd 10: Bad file descriptor
I think Docker cannot handle this either, because its live checkpointing mechanism is built on top of CRIU. Is there any way to handle this, or has anyone done something like this before in Docker?
Thanks
Using the Docker CLI for checkpointing and restoring is much easier and causes fewer problems. You can try: https://forums.docker.com/t/docker-checkpoint-restore-on-another-host/27427/2 and https://github.com/docker/cli/blob/master/experimental/checkpoint-restore.md
Specifically, Docker's experimental mode has to be enabled. Checkpoint:
sudo docker checkpoint create --checkpoint-dir=<dir-checkpoint-files> <container-ID> <checkpoint-name>
Restore:
sudo docker create --name <container-name> <container-image>
sudo docker start --checkpoint=<checkpoint-name> --checkpoint-dir=<dir-checkpoint-files> <container-name>
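If you want to script these steps, a minimal Python sketch could look like the following. It only builds and prints the same three command lines shown above; the function names and the example values (game-1, game-2, my-game-image, ckpt1, /tmp/ckpt) are placeholders I made up for illustration, not anything Docker defines.

```python
import shlex


def checkpoint_cmd(container, checkpoint_name, checkpoint_dir):
    """Build the `docker checkpoint create` command line as an argument list."""
    return [
        "docker", "checkpoint", "create",
        f"--checkpoint-dir={checkpoint_dir}",
        container, checkpoint_name,
    ]


def restore_cmds(new_container, image, checkpoint_name, checkpoint_dir):
    """Build the two commands for a restore: create a container, then start it from the checkpoint."""
    return [
        ["docker", "create", "--name", new_container, image],
        [
            "docker", "start",
            f"--checkpoint={checkpoint_name}",
            f"--checkpoint-dir={checkpoint_dir}",
            new_container,
        ],
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    print(shlex.join(checkpoint_cmd("game-1", "ckpt1", "/tmp/ckpt")))
    for cmd in restore_cmds("game-2", "my-game-image", "ckpt1", "/tmp/ckpt"):
        print(shlex.join(cmd))
```

To actually execute the commands you could hand each list to subprocess.run(cmd, check=True), prefixed with sudo if your setup needs it.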
I also tried to use CRIU directly for checkpointing and restoring; however, it caused a lot of problems like yours.
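If you do stay with plain CRIU: the "Can't fstat inherit fd 10: Bad file descriptor" message is what fstat returns (EBADF) when the descriptor number is not open in the calling process. As far as I understand --inherit-fd, the number in fd[N] refers to a descriptor that must already be open in the process that runs criu restore, so launching it from a fresh shell where fd 10 is not open would fail this way. The sketch below is not CRIU code; it only illustrates the general rule with a pipe and a child Python process, and the helper names are my own.

```python
import errno
import os
import subprocess
import sys


def fd_is_open(fd):
    """Return True if fd is an open descriptor in this process."""
    try:
        os.fstat(fd)
        return True
    except OSError as exc:
        if exc.errno == errno.EBADF:
            return False
        raise


def child_sees_fd(fd, pass_it):
    """Spawn a child process and report whether it can fstat the same fd number."""
    probe = (
        "import os, sys\n"
        "try:\n"
        "    os.fstat(int(sys.argv[1]))\n"
        "    print('open')\n"
        "except OSError:\n"
        "    print('closed')\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe, str(fd)],
        # Only descriptors listed in pass_fds survive into the child.
        pass_fds=(fd,) if pass_it else (),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "open"


if __name__ == "__main__":
    read_end, write_end = os.pipe()
    print("passed to child:    ", child_sees_fd(read_end, pass_it=True))
    print("not passed to child:", child_sees_fd(read_end, pass_it=False))
    os.close(read_end)
    os.close(write_end)
```

In other words, whatever launches criu restore has to open the replacement sockets and pipes first, at the descriptor numbers named in the --inherit-fd options, and keep them open across the exec.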