Tags: docker, mount, linux-capabilities, lustre

Mounting Lustre Inside Running Container Not Working (Have Added All Capabilities)


We are trying to mount a Lustre filesystem inside a running container, and have done this successfully in containers running in privileged mode.

However, in containers running in non-privileged mode, mounting Lustre fails, even when every capability Linux provides (several dozen of them) is added.

So:

  1. What is the difference between "privileged: true" and adding all capabilities with "cap_add"?
  2. Why does mounting Lustre still fail when all capabilities have been added to the container?

Mount Error: (screenshot not included)

Non-Privileged Mode Container:

version: "3"
services:
  aiart:
    cap_add:
      - AUDIT_CONTROL
      - AUDIT_READ
      - AUDIT_WRITE
      - BLOCK_SUSPEND
      - CHOWN
      - DAC_OVERRIDE
      - DAC_READ_SEARCH
      - FOWNER
      - FSETID
      - IPC_LOCK
      - IPC_OWNER
      - KILL
      - LEASE
      - LINUX_IMMUTABLE
      - MAC_ADMIN
      - MAC_OVERRIDE
      - MKNOD
      - NET_ADMIN
      - NET_BIND_SERVICE
      - NET_BROADCAST
      - NET_RAW
      - SETGID
      - SETFCAP
      - SETPCAP
      - SETUID
      - SYS_ADMIN
      - SYS_BOOT
      - SYS_CHROOT
      - SYS_MODULE
      - SYS_NICE
      - SYS_PACCT
      - SYS_PTRACE
      - SYS_RAWIO
      - SYS_RESOURCE
      - SYS_TIME
      - SYS_TTY_CONFIG
      - SYSLOG
      - WAKE_ALARM

    image: test_lustre:1.1
    #privileged: true
    ports:
      - "12345:12345"
    volumes:
      - /home/wallace/test-lustre/docker/lustre-client:/lustre/lustre-client

Solution

  • The difference between --privileged and adding all capabilities is that --privileged also removes the limits enforced by the device cgroup controller, gives the container access to all host devices, and disables the other security enhancements Docker normally applies. A privileged container effectively becomes part of the host operating system: AppArmor and SELinux confinement, such as SELinux labels, is not applied to it.

    When the --privileged flag is used, Docker enforces no extra security on the container, and kernel filesystems are not mounted read-only inside it. Seccomp filtering is disabled as well. Still, you can't get more power than the current namespace allows, for example if you are running a rootless daemon.

    Capabilities are a way to adjust the power of root, but the remaining security enhancements are still applied when the container runs.

    Red Hat has published a good blog post on this topic.

    As pointed out in another answer, AppArmor is probably the issue in this case: Docker's default AppArmor profile denies mount operations regardless of which capabilities the container holds. Running the container with --security-opt apparmor=unconfined might make the mount possible. However, that should be used only temporarily.
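
As a rough sketch of the suggestion above, the Compose file from the question could be reduced to something like the following. It reuses the service name, image, port and volume from the question. The assumptions here, untested against Lustre, are that SYS_ADMIN is the capability the mount needs and that the Lustre client kernel modules are already loaded on the host, since the container shares the host kernel.

```yaml
version: "3"
services:
  aiart:
    image: test_lustre:1.1
    cap_add:
      # Assumption: SYS_ADMIN is the capability mount(2) checks for.
      # "ALL" is also accepted here as shorthand for the full list.
      - SYS_ADMIN
    security_opt:
      # Turns off the default AppArmor profile, which denies mount.
      # Intended for temporary troubleshooting only.
      - apparmor=unconfined
    ports:
      - "12345:12345"
    volumes:
      - /home/wallace/test-lustre/docker/lustre-client:/lustre/lustre-client
```

If the mount works with this file, a custom AppArmor profile that allows only the required mount is a narrower long-term option than leaving the container unconfined.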