
File ownership after docker cp


How can I control which user owns the files I copy in and out of a container?

The docker cp command says this about file ownership:

The cp command behaves like the Unix cp -a command in that directories are copied recursively with permissions preserved if possible. Ownership is set to the user and primary group at the destination. For example, files copied to a container are created with UID:GID of the root user. Files copied to the local machine are created with the UID:GID of the user which invoked the docker cp command. However, if you specify the -a option, docker cp sets the ownership to the user and primary group at the source.

It says that files copied to a container are created with the UID:GID of the root user, but that's not what I see. When I copy in files owned by user ids 1005 and 1006, those owners are preserved, translated into the container's user namespace. The -a option seems to make no difference when I copy a file into the container.

$ sudo chown 1005:1005 test.txt
$ ls -l test.txt
-rw-r--r-- 1 1005 1005 29 Oct  6 12:43 test.txt
$ docker volume create sandbox1
sandbox1
$ docker run --name run1 -vsandbox1:/data alpine echo OK
OK
$ docker cp test.txt run1:/data/test1005.txt
$ docker cp -a test.txt run1:/data/test1005a.txt
$ sudo chown 1006:1006 test.txt
$ docker cp test.txt run1:/data/test1006.txt
$ docker cp -a test.txt run1:/data/test1006a.txt
$ docker run --rm -vsandbox1:/data alpine ls -l /data
total 16
-rw-r--r--    1 1005     1005            29 Oct  6 19:43 test1005.txt
-rw-r--r--    1 1005     1005            29 Oct  6 19:43 test1005a.txt
-rw-r--r--    1 1006     1006            29 Oct  6 19:43 test1006.txt
-rw-r--r--    1 1006     1006            29 Oct  6 19:43 test1006a.txt

When I copy files out of the container, they are always owned by me. Again, the -a option seems to do nothing.

$ docker run --rm -vsandbox1:/data alpine cp /data/test1006.txt /data/test1007.txt
$ docker run --rm -vsandbox1:/data alpine chown 1007:1007 /data/test1007.txt
$ docker cp run1:/data/test1006.txt .
$ docker cp run1:/data/test1007.txt .
$ docker cp -a run1:/data/test1006.txt test1006a.txt
$ docker cp -a run1:/data/test1007.txt test1007a.txt
$ ls -l test*.txt
-rw-r--r-- 1 don  don  29 Oct  6 12:43 test1006a.txt
-rw-r--r-- 1 don  don  29 Oct  6 12:43 test1006.txt
-rw-r--r-- 1 don  don  29 Oct  6 12:47 test1007a.txt
-rw-r--r-- 1 don  don  29 Oct  6 12:47 test1007.txt
-rw-r--r-- 1 1006 1006 29 Oct  6 12:43 test.txt
$ 

Solution

  • In order to get complete control of file ownership, I used the tar stream feature of docker cp:

    If - is specified for either the SRC_PATH or DEST_PATH, you can also stream a tar archive from STDIN or to STDOUT.

    I launch the docker cp process, then stream a tar file to or from the process. As the tar entries go past, I can adjust the ownership and permissions however I like.

    Here's a simple example in Python that copies all the files from /outputs in the sandbox1 container to the current directory, skips the current directory itself so its permissions don't get changed, and forces every file to be readable and writable by the user. A sketch of the reverse direction, copying into the container, follows the example.

    from subprocess import Popen, PIPE, CalledProcessError
    import tarfile
    
    def main():
        # Ask docker cp to write a tar archive of /outputs to stdout.
        export_args = ['sudo', 'docker', 'cp', 'sandbox1:/outputs/.', '-']
        exporter = Popen(export_args, stdout=PIPE)
        # 'r|' reads the archive as a non-seekable stream straight off the pipe.
        tar_file = tarfile.open(fileobj=exporter.stdout, mode='r|')
        tar_file.extractall('.', members=exclude_root(tar_file))
        exporter.wait()
        if exporter.returncode:
            raise CalledProcessError(exporter.returncode, export_args)
    
    def exclude_root(tarinfos):
        # Skip the '.' entry and force owner read/write on everything else.
        print('\nOutputs:')
        for tarinfo in tarinfos:
            if tarinfo.name != '.':
                assert tarinfo.name.startswith('./'), tarinfo.name
                print(tarinfo.name[2:])
                tarinfo.mode |= 0o600
                yield tarinfo
    
    main()
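
    The same trick works in the other direction: stream a tar archive into docker cp and set whatever ownership you want on each entry before it crosses the pipe. Here's a minimal sketch, assuming a container named sandbox1 with an /inputs directory; the copy_in helper and its defaults are just for illustration.

    from subprocess import Popen, PIPE, CalledProcessError
    import tarfile

    def copy_in(path, uid=0, gid=0):
        # docker cp reads a tar archive from stdin when SRC_PATH is '-'.
        import_args = ['sudo', 'docker', 'cp', '-', 'sandbox1:/inputs']
        importer = Popen(import_args, stdin=PIPE)
        with tarfile.open(fileobj=importer.stdin, mode='w|') as tar_file:
            # Rewrite the entry's ownership before it goes onto the stream.
            tarinfo = tar_file.gettarinfo(path)
            tarinfo.uid = uid
            tarinfo.gid = gid
            with open(path, 'rb') as f:
                tar_file.addfile(tarinfo, f)
        importer.stdin.close()  # EOF tells docker cp the archive is complete
        importer.wait()
        if importer.returncode:
            raise CalledProcessError(importer.returncode, import_args)

    copy_in('test.txt', uid=0, gid=0)

    The uid and gid written into the tar headers are what the entries should end up with inside the container, which matches the behaviour the experiment above showed for plain docker cp.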