How do I propagate a mount created in a child namespace to the parent?
I am trying to create a tool that uses overlayfs to allow writes over a read-only directory. The tricky bit is that I would like any user to be able to use it without root privileges. I was hoping this could be achieved with a mount namespace: provided an admin has mounted a directory as shared, any user should then be able to create an overlay under that tree that is visible from the parent namespace (so any of the user's login shells can see that overlay mount).
Here is what I tried, but it is not working:
# admin creates a shared tree for users to mount under
sudo mkdir /overlays
# bind mount over itself with MS_REC | MS_SHARED
sudo mount --bind --rshared /overlays /overlays
Assuming a user then wants to create an overlay over /some/readonly/dir, they should create /overlays/user/{upper,work,mnt}. I would expect them to be able to mount an overlay under the /overlays dir that propagates with the following code.
// user_overlay.c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <linux/capability.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#define STACK_SIZE (1024 * 1024)

static int child(void *arg)
{
    int rc;   // mount() returns an int, not a pid_t

    (void)arg;
    rc = mount("overlay", "/overlays/user/mnt", "overlay", 0,
               "lowerdir=/some/readonly/dir,upperdir=/overlays/user/upper,workdir=/overlays/user/work");
    if (rc == -1) {
        perror("Failed to mount overlay");
        exit(EXIT_FAILURE);
    }
    // Expose the mount to the parent namespace
    rc = mount("none", "/overlays/user/mnt", NULL, MS_SHARED, NULL);
    if (rc == -1) {
        perror("Failed to mark mount as shared");
        exit(EXIT_FAILURE);
    }
    // Exec bash so I can ensure that the mnt was created,
    // though in practice I would just daemonize this proc
    // such that the mount is visible in the parent
    // until this proc is killed
    char *newargv[] = { "/bin/bash", NULL };
    execv("/bin/bash", newargv);
    perror("exec");
    exit(EXIT_FAILURE);
}

int main(void)
{
    // clone() expects a pointer to the top of the child's stack
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    pid_t p = clone(child, stack + STACK_SIZE,
                    CLONE_NEWNS | CLONE_NEWUSER | SIGCHLD, NULL);
    if (p == -1) {
        perror("clone");
        exit(EXIT_FAILURE);
    }
    // Wait until the bash proc in the child finishes
    waitpid(p, NULL, 0);
    free(stack);
    return 0;
}
Executing gcc user_overlay.c -o user_overlay && ./user_overlay indeed mounts the overlay in that child process, but /overlays/user/mnt is not propagated to the parent. However, modifications to /overlays/user/upper are visible from both parent and child.
What you are trying to achieve does not appear to be possible, at least not with the method above. You want to grant mounting permissions to an unprivileged user by creating a new user namespace via CLONE_NEWUSER. However, citing mount_namespaces(7) (emphasis mine):
Restrictions on mount namespaces
Note the following points with respect to mount namespaces:
* A mount namespace has an owner user namespace. A mount namespace whose owner user namespace is different from the owner user namespace of its parent mount namespace is considered a less privileged mount namespace.
* When creating a less privileged mount namespace, shared mounts are reduced to slave mounts. (Shared and slave mounts are discussed below.) This ensures that mappings performed in less privileged mount namespaces will not propagate to more privileged mount namespaces.
This means that the mounts in your new namespace in fact have the slave propagation type rather than shared, as you would expect. As a result, mount events in the child are not propagated to the parent mount namespace.