How to test user namespace with clone system call with CLONE_NEWUSER flag


I am testing the sample from Containerization with LXC that demonstrates user namespaces.

It is supposed to print output from both the parent process and the child process, which runs in a new user namespace:

# ./user_namespace
UID outside the namespace is 0
GID outside the namespace is 0
UID inside the namespace is 65534
GID inside the namespace is 65534

However, it only shows the parent's output:

UID outside the namespace is 1000
GID outside the namespace is 1000

Please help me understand why the child process is not printing anything.

Code

#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sched.h>
#include <signal.h>
#include <sys/wait.h>   /* waitpid */

static int childFunc(void *arg)
{
    printf("UID inside the namespace is %ld\n", (long)geteuid());
    printf("GID inside the namespace is %ld\n", (long)getegid());
    return 0;   /* becomes the child's exit status */
}

static char child_stack[1024*1024];

int main(int argc, char *argv[])
{
    pid_t child_pid;

    /* child_pid = clone(childFunc, child_stack + (1024*1024), CLONE_NEWUSER, 0);*/

    child_pid = clone(&childFunc, child_stack + (1024*1024), CLONE_NEWUSER, 0);

    printf("UID outside the namespace is %ld\n", (long)geteuid());
    printf("GID outside the namespace is %ld\n", (long)getegid());
    waitpid(child_pid, NULL, 0);
    exit(EXIT_SUCCESS);
}

Environment

$ uname -r
3.10.0-693.21.1.el7.x86_64

$ cat /etc/os-release 
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
CPE_NAME="cpe:/o:centos:centos:7"

Update

As per the answer from thejonny, the problem was that user namespaces were not enabled. For RHEL/CentOS 7, see "Is it safe to enable user namespaces in CentOS 7.4 and how to do it?", which says:

By default, the new 7.4 kernel restricts the number of user namespaces to 0. To work around this, increase the user namespace limit:
echo 15000 > /proc/sys/user/max_user_namespaces
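To see which limit is currently in effect before changing it, the same file can also be read from C. This is only a minimal sketch, assuming the kernel exposes /proc/sys/user/max_user_namespaces; the helper name read_max_user_namespaces and its path parameter are made up for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a single integer limit from a sysctl-style file.
 * Returns the value read, or -1 if the file is missing or cannot be parsed.
 */
static long read_max_user_namespaces(const char *path)
{
    long limit = -1;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;              /* file absent or not readable */

    if (fscanf(f, "%ld", &limit) != 1)
        limit = -1;             /* contents were not a number */

    fclose(f);
    return limit;
}
```

Calling read_max_user_namespaces("/proc/sys/user/max_user_namespaces") and getting 0 would match the situation described in the quote; -1 means the file is absent or unreadable on that kernel.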


Solution

  • Unprivileged user namespaces are probably disabled. Because you don't check the return value of clone, you don't notice the failure. Running the program through strace on my system prints:

    .... startup stuff ...
    clone(child_stack=0x55b41f2a4070, flags=CLONE_NEWUSER) = -1 EPERM (Operation not permitted)
    geteuid()                               = 1000
    fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 6), ...}) = 0
    brk(NULL)                               = 0x55b4200b8000
    brk(0x55b4200d9000)                     = 0x55b4200d9000
    write(1, "UID outside the namespace is 100"..., 34UID outside the namespace is 1000
    ) = 34
    getegid()                               = 1000
    write(1, "GID outside the namespace is 100"..., 34GID outside the namespace is 1000
    ) = 34
    wait4(-1, NULL, 0, NULL)                = -1 ECHILD (No child processes)
    exit_group(0)   = ?
    

    So clone fails, and therefore waitpid fails as well: there is no child process.

    See here for how to enable user namespaces: https://superuser.com/questions/1094597/enable-user-namespaces-in-debian-kernel
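The point about the unchecked return value can be sketched as follows. This is a minimal reworking of the question's code, not the book's listing. The helper name run_in_new_userns is hypothetical. SIGCHLD is added to the clone flags so that a plain waitpid can reap the child, and stdout is flushed explicitly on both sides in case the child's exit path skips stdio cleanup.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

static char child_stack[STACK_SIZE];

static int childFunc(void *arg)
{
    (void)arg;
    printf("UID inside the namespace is %ld\n", (long)geteuid());
    printf("GID inside the namespace is %ld\n", (long)getegid());
    fflush(stdout);     /* do not rely on exit-time flushing in the child */
    return 0;           /* becomes the child's exit status */
}

/*
 * Hypothetical helper: run childFunc in a new user namespace.
 * Returns 0 on success, or the errno value of the call that failed.
 */
static int run_in_new_userns(void)
{
    pid_t child_pid;

    fflush(stdout);     /* avoid duplicating buffered output in the child */

    /* The stack grows down, so pass the top of the buffer.
     * SIGCHLD makes the child waitable with an ordinary waitpid(). */
    child_pid = clone(childFunc, child_stack + STACK_SIZE,
                      CLONE_NEWUSER | SIGCHLD, NULL);
    if (child_pid == -1) {
        int err = errno;
        fprintf(stderr, "clone(CLONE_NEWUSER) failed: %s\n", strerror(err));
        return err;
    }

    printf("UID outside the namespace is %ld\n", (long)geteuid());
    printf("GID outside the namespace is %ld\n", (long)getegid());

    if (waitpid(child_pid, NULL, 0) == -1) {
        int err = errno;
        fprintf(stderr, "waitpid failed: %s\n", strerror(err));
        return err;
    }
    return 0;
}
```

Calling run_in_new_userns() from main and exiting with EXIT_FAILURE on a nonzero result makes the failure visible: on a system where user namespaces are disabled, the program reports the clone error instead of silently printing only the parent's lines.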