
Live checkpoint with criu or docker for game applications


I have a problem regarding live checkpointing and restoring using criu and docker.

Currently I am using Ubuntu 16.04 and I want to implement a feature that lets users checkpoint the current state of a game application. For example, the user can checkpoint at any time while playing the game. However, criu dump failed when I tried it on a game written with SFML.

To be more specific, I used a game written with SFML and criu 2.x under Ubuntu 16.04. After starting the game, I tried the dump functionality provided by criu as follows:

criu dump -D img -t 2833 --shell-job --tcp-established --ext-unix-sk --external unix[33323] --external unix[33326] --external unix[33316] --external unix[33317]

It seems that criu's memory and file dump does not support live checkpointing of certain devices, such as the video card at /dev/dri/card0. Thus, I modified the criu source code to skip unsupported devices when they are encountered:

criu/parse_proc.c, line 721

However, an error happened when I was trying to restore with the following command:

criu restore -D img -t 2833 --shell-job --tcp-established --ext-unix-sk --inherit-fd fd[10]:socket:[33326] --inherit-fd fd[3]:socket:[33316] --inherit-fd fd[7]:socket:[33317] --inherit-fd fd[8]:pipe:[33323] --inherit-fd fd[9]:pipe:[33323]

The error message says:

Error (criu/files.c:1477): Can't fstat inherit fd 10: Bad file descriptor

I think Docker cannot handle it either, because its live checkpointing mechanism is built on top of criu. Is there any way to handle this, or has anyone done this before in Docker?

Thanks


Solution

  • Using the Docker CLI for checkpointing and restoring is much easier and causes fewer problems. You can try: https://forums.docker.com/t/docker-checkpoint-restore-on-another-host/27427/2 and https://github.com/docker/cli/blob/master/experimental/checkpoint-restore.md

    Specifically, the experimental mode in docker has to be enabled. Checkpoint:

    sudo docker checkpoint create --checkpoint-dir=<dir-checkpoint-files> <container-ID> <checkpoint-name>
    

    Restore:

    sudo docker create --name <container-name> <container-image> 
    sudo docker start --checkpoint=<checkpoint-name> --checkpoint-dir=<dir-checkpoint-files> <container-name>
    

    I tried to use criu for checkpointing and restoring; however, it caused a lot of problems like yours.
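
The checkpoint and restore commands above can be wrapped in a small script so the same sequence is repeatable. This is only a rough sketch: the helper names (run, checkpoint_container, restore_container) and the DRY_RUN variable are made up for illustration, sudo is left out on the assumption that your user can talk to the Docker daemon, and experimental mode is assumed to be enabled already.

```shell
#!/bin/sh
# Sketch only: wraps the docker checkpoint/restore commands from the answer.
# Set DRY_RUN=1 to print the docker commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# checkpoint_container <container> <checkpoint-name> <checkpoint-dir>
checkpoint_container() {
    run docker checkpoint create --checkpoint-dir="$3" "$1" "$2"
}

# restore_container <new-container-name> <image> <checkpoint-name> <checkpoint-dir>
# Creates a fresh container from the image, then starts it from the checkpoint.
restore_container() {
    run docker create --name "$1" "$2" &&
    run docker start --checkpoint="$3" --checkpoint-dir="$4" "$1"
}
```

For example, after sourcing the script, `DRY_RUN=1 checkpoint_container game cp1 /tmp/cp` prints the command that would be run (game, cp1 and /tmp/cp are placeholder values); drop DRY_RUN to actually execute it.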