
Why is the parent pid of a Python process sometimes 0 in Docker containers with docker-compose?


When running a Python shell directly via docker-compose run, the parent PID shows up as 0, which feels very wrong. I've put together a very simple, reproducible case below:

# Dockerfile
FROM python:3.7-buster
COPY . /code/
WORKDIR /code

# docker-compose.yml
version: '3'

services:
  thing:
    build: .
    volumes:
      - .:/code

When I run a Python shell this way, its ppid is 0; the same goes for any Python code run this way (for example, tests run with pytest):

$ docker-compose run thing python
>>> import os
>>> os.getpid()
1
>>> os.getppid()
0

When I run a python shell from within a bash shell, I see a more sane value...

$ docker-compose run thing bash
root@<id> # python
>>> import os
>>> os.getpid()
6
>>> os.getppid()
1

When I run the python shell straight on my host machine, I also see more sane PID values...

$ python
>>> import os
>>> os.getpid()
25552
>>> os.getppid()
1133

I'm sure this is some strange behavior on how docker treats processes in a running container, but it doesn't seem to me like a PID should ever be 0. Is this expected behavior, and if so is there a workaround for Python code running this way that relies on a parent PID?


Solution

  • This is neither strange nor wrong -- the parent PID is 0 because the process has no parent inside the container's PID namespace.

    When running something like docker-compose run thing python, the first process ever started in that container (or more precisely, in that PID namespace) will be the python process itself. Therefore, it gets PID 1. Whatever started it lives outside that namespace, and for a parent in a different PID namespace getppid() returns 0.

    Note: The exact same thing also happens on regular (non-containerized) Linux systems: the process with PID 1 is the first user-space process (started by the kernel after booting, in that case), its parent PID is also 0, and it is usually an init system like systemd. The init system then handles the user-space part of booting your Linux system (like setting up your network, mounting file systems and starting system services). In a Docker container, there's typically no need for any of this, so no init system is required. However, there are init systems like dumb-init which are specifically made for running in containers.

    When running a shell in a container, and starting Python from that shell, the shell will be PID 1. In that case, running os.getppid() from your Python script should return 1.
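The same parent/child relationship can be checked without a shell. The following is a minimal sketch, assuming Python 3.7+ (as in the question's image); the helper name child_reported_ppid is made up for illustration. It starts a second interpreter and has it report its parent PID, which should be the PID of whatever launched it:

```python
import os
import subprocess
import sys


def child_reported_ppid() -> int:
    """Start a child Python interpreter and return the parent PID it reports."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getppid())"],
        capture_output=True,  # collect the child's stdout instead of printing it
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # The launcher plays the role the bash shell plays in the question.
    print("launcher pid:", os.getpid())
    print("ppid seen by child:", child_reported_ppid())
```

If this script were itself the container's first process, the launcher would be PID 1 and the child would report a parent PID of 1, just like python started from bash.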

    Is this expected behavior, and if so is there a workaround for Python code running this way that relies on a parent PID?

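    As mentioned, you could start your Python environment from a shell (or any other process, for that matter) that would then get PID 1 instead of python. You could also use an init system designed for containers like dumb-init; Docker can add a minimal one for you with docker run --init (or init: true on the service in Compose file format 3.7 and later). Either way, python runs as a child of PID 1 and os.getppid() returns a non-zero value.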
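If the code has to keep working when it is started as PID 1, it can also treat a parent PID of 0 as "no visible parent" rather than as a real PID. This is only a sketch; visible_parent and signal_parent are hypothetical helper names, not part of any library. The guard matters in particular before signalling, because os.kill(0, sig) does not target a parent at all: on POSIX, PID 0 means every process in the caller's process group.

```python
import os
from typing import Optional


def visible_parent(ppid: int) -> Optional[int]:
    """Return ppid, or None when it is 0 (no parent visible in this PID namespace)."""
    return ppid if ppid > 0 else None


def signal_parent(sig: int) -> bool:
    """Send sig to the parent process if one is visible; return whether it was sent."""
    parent = visible_parent(os.getppid())
    if parent is None:
        # Running as PID 1 (or otherwise without a visible parent): skip the call,
        # since os.kill(0, sig) would signal the caller's whole process group.
        return False
    os.kill(parent, sig)
    return True


if __name__ == "__main__":
    print("visible parent:", visible_parent(os.getppid()))
```

With the values from the question, visible_parent(0) gives None for python started directly by docker-compose run, and visible_parent(1) gives 1 for python started from bash.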