Effects of dropping the FSETID capability in a Docker container

I'm exploring Docker container capabilities and permissions, and I'm curious about the implications of dropping the FSETID capability by running a container with the --cap-drop FSETID option. I know that it is related to the SUID and SGID bits in file and directory permissions.

Can anyone provide a clear explanation of what the FSETID capability allows within a Docker container by default? What specific actions or operations are restricted within the container after dropping the FSETID capability? I've noticed that even when running the Docker container as a non-root user with the --cap-drop FSETID option, I can still change file and folder permissions, and even run commands like passwd. Why is this possible despite the capability being dropped? I'd be very thankful if you could describe a scenario that behaves differently once the FSETID capability is dropped.

I created a new image with the Dockerfile below:

FROM ubuntu:latest

RUN useradd -m -s /bin/bash myuser
RUN echo 'myuser:mypassword' | chpasswd
RUN apt-get update && apt-get install -y sudo
RUN usermod -aG sudo myuser
USER myuser

CMD ["bash"]

I also created a container with the following command, but I didn't notice any restrictions on the new container.

docker run -it --cap-drop FSETID new_image bash

Solution

According to capabilities(7), CAP_FSETID has two effects:

Don't clear set-user-ID and set-group-ID mode bits when a file is modified;

set the set-group-ID bit for a file whose GID does not match the filesystem or any of the supplementary GIDs of the calling process.

Normally, if a file has the setuid or setgid bit set, modifying the file causes the kernel to clear those bits. A process with CAP_FSETID is exempt from that. The capability matters when you modify a file like /usr/bin/passwd, not when you merely run it.

You can see this by running

host# docker run --rm -it ubuntu bash
root@0123456789ab:/# ls -l /usr/bin/passwd
root@0123456789ab:/# echo '' >> /usr/bin/passwd
root@0123456789ab:/# ls -l /usr/bin/passwd

Run as shown, with Docker's default capability set, both ls listings show the setuid bit. If you run this with --cap-drop FSETID instead, the last line shows that the modified passwd binary has lost the setuid bit.
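If you would rather not touch a real system binary, the same check can be sketched on a scratch file. This is a minimal sketch, assuming a Linux container with GNU or BusyBox stat; suid_demo is a hypothetical helper name chosen for illustration.

```shell
#!/bin/sh
# Sketch: watch the setuid bit on a scratch file across a write.
# suid_demo is a hypothetical helper name, not part of any tool.
suid_demo() {
    f=$(mktemp) || return 1
    chmod 4755 "$f"                 # set the setuid bit (the leading 4)
    before=$(stat -c '%a' "$f")     # octal mode before the write
    echo '' >> "$f"                 # modify the file
    after=$(stat -c '%a' "$f")      # octal mode after the write
    rm -f "$f"
    echo "before=$before after=$after"
}

suid_demo
```

As root with CAP_FSETID, the mode should stay 4755. As root with --cap-drop FSETID, or as an ordinary non-root user (who holds no effective capabilities anyway), it should fall back to 755.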

In the example you show, you are not modifying a setuid binary, so you never hit this case. You are running setuid binaries, but running them does not trigger the behavior. Furthermore, the places where you run chpasswd and usermod are in the image build, which happens before any docker run options are considered, so those steps still have the default capability set.
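To confirm which capability set a process actually has, one option is to decode the CapEff mask from /proc. The sketch below assumes Linux with /proc mounted. has_fsetid is a hypothetical helper name, and the bit position 4 is CAP_FSETID's number in linux/capability.h.

```shell
#!/bin/sh
# Sketch: test whether bit 4 (CAP_FSETID) is set in a capability mask.
# has_fsetid is a hypothetical helper name used only for this example.
has_fsetid() {
    # $1: hexadecimal mask without the 0x prefix, e.g. a CapEff value
    if [ $(( (0x$1 >> 4) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

# Effective capability mask of the current shell, if /proc is available.
cap_eff=$(awk '/^CapEff:/ {print $2}' "/proc/$$/status" 2>/dev/null)
if [ -n "$cap_eff" ]; then
    echo "CapEff=$cap_eff CAP_FSETID=$(has_fsetid "$cap_eff")"
fi
```

For illustration, has_fsetid 00000000a80425fb prints yes and has_fsetid 00000000a80425eb prints no. The first value is the mask commonly seen for root under Docker's default capability set, and the second is the same mask with bit 4 cleared. Under USER myuser the effective mask is normally all zeros regardless of --cap-drop.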

I'd consider it extremely unusual for a container to modify its binaries at runtime, so it should almost always be safe to drop this capability. (You also should not need the password setup in the Dockerfile or to install sudo, and it'd be safe to configure the user with a default shell or no shell at all.)