I'm running a build tool and testing a package inside a podman container. The build tool and tests must not be run as root, but both parts use the system package manager, which must be run as root.
To keep both happy, I'm creating a user (called user) and adding it to a group (wheel) which is allowed to use sudo without a password to run the package manager.
# Dockerfile
FROM alpine
RUN apk add shadow sudo
RUN echo '%wheel ALL=(ALL:ALL) NOPASSWD: ALL' >> /etc/sudoers
RUN useradd --create-home --non-unique --uid 1000 --groups wheel user
This is working great on my native architecture:
$ podman run --arch=x86_64 -it --user=1000:wheel $(podman build -q --arch=x86_64 .) sudo whoami
root
$ podman run --arch=x86_64 -it --user=1000:wheel $(podman build -q --arch=x86_64 .) whoami
user
But as soon as I add qemu architecture emulation into the mix, sudo refuses to do anything:
> podman run --arch=ppc64le -it --user=1000:wheel $(podman build -q --arch=ppc64le .) sudo --help
sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?
What does this mean? Why does it have anything to do with qemu and how do I get it to stop moaning and work?
Additional info
~ # ls -l /usr/bin/sudo
-rwsr-xr-x 1 root root 268912 Mar 1 13:55 /usr/bin/sudo
The same message also seems to come from messing with the ownership/permissions of /usr/bin/sudo. The prescribed fix hasn't made any difference in my case.
This does work with docker instead of podman, provided I set --credential yes in the magic qemu setup command:
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes --credential yes
Podman has no equivalent of this step however.
N.B. I'm not juggling between users to enhance security, so workarounds which compromise security are welcome. Equally welcome are other ways to become root besides sudo (I have tried doas and pkexec but they've been even more aggravating).
The issue is fundamentally related to a (mis)configuration of binfmt_misc.
The sudo command is a setuid binary which always executes as the root user. However, when the sudo binary is of a foreign architecture and gets executed by binfmt_misc through an interpreter, by default the kernel uses the permissions of the interpreter, which is not setuid.
There is a flag to change the behavior:
C - credentials
Currently, the behavior of binfmt_misc is to calculate the credentials and security token of the new process according to the interpreter. When this flag is included, these attributes are calculated according to the binary. It also implies the O flag. This feature should be used with care as the interpreter will run with root permissions when a setuid binary owned by root is run with binfmt_misc.
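Before changing anything, it may help to check whether the flag is already set. The kernel exposes each registered handler as a file under /proc/sys/fs/binfmt_misc, with a line such as "flags: OCF". The sketch below assumes the handler is registered as qemu-ppc64le (the name may differ on your system), and has_credentials_flag is a hypothetical helper name:

```shell
# Hypothetical helper: succeeds if the given binfmt_misc entry has the
# C (credentials) flag. The entry file contains a line like "flags: OCF".
has_credentials_flag() {
    grep -q '^flags:.*C' "$1"
}

# Assumed entry name; list /proc/sys/fs/binfmt_misc to find the real one.
entry=/proc/sys/fs/binfmt_misc/qemu-ppc64le

if [ -r "$entry" ]; then
    if has_credentials_flag "$entry"; then
        echo "C flag is set: setuid binaries should keep their privileges"
    else
        echo "C flag is missing: credentials come from the interpreter"
    fi
fi
```

If the flag is missing, a foreign-architecture sudo runs with the credentials of the non-setuid interpreter, which matches the "effective uid is not 0" error in the question.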
If you are running podman directly on a Linux system, and your distribution uses systemd-binfmt (Fedora, Ubuntu), you can make overrides of the distribution files to add the flag like this:
for x in /usr/lib/binfmt.d/*.conf; do
    sed 's/\(:[^C:]*\)$/\1C/' "$x" | \
        sudo tee /etc/binfmt.d/"$(basename "$x")"
done
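To see what the sed expression does without touching any system files, it can be run on a sample rule. The rule below is an abbreviated, made-up example (the magic and mask fields are shortened, so it is not a usable registration), and add_credentials_flag is a hypothetical wrapper around the same expression:

```shell
# Hypothetical wrapper around the sed expression used above: append C to
# the last (flags) field of each rule, unless that field already has a C.
add_credentials_flag() {
    sed 's/\(:[^C:]*\)$/\1C/'
}

# Abbreviated sample rule for illustration only; real magic/mask are longer.
sample=':qemu-ppc64le:M::\x7fELF:\xff\xff\xff\xff:/usr/bin/qemu-ppc64le-static:F'

printf '%s\n' "$sample" | add_credentials_flag
# prints the same rule with the flags field changed from F to FC
```

A rule whose flags already include C (for example OCF) passes through unchanged, so the loop is safe to run more than once. The overrides in /etc/binfmt.d probably only take effect once the rules are reloaded; on systemd-based systems that is likely a restart of systemd-binfmt.service or a reboot, though I have not verified this on every distribution.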
If you are using podman on macOS or Windows, your containers run on the podman machine virtual machine, which is running a Fedora image. To update the virtual machine configuration, use podman machine ssh to enter a shell where you can run the above command.