docker unix-socket xvfb xserver

Cannot bind X UNIX socket within Docker container


I am trying to run Xvfb within a Docker container, but it is failing with the error:

$ Xvfb :0 -nolisten tcp -screen 0 1024x768x24
_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
(EE) 
Fatal server error:
(EE) Cannot establish any listening sockets - Make sure an X server isn't already running(EE) 

According to strace, it's trying to bind a UNIX socket at /tmp/.X11-unix/X0:

bind(4, {sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X0"}, 20) = -1 EADDRINUSE (Address already in use)

However, the file does not already exist in the container (verified by ls -l /tmp/.X11-unix).

If I use a different screen number, like :1, the program succeeds.

This container is running in --network host mode, and /tmp/.X11-unix/X0 does exist on the host. Does this create some kind of issue where the container and the host cannot have a UNIX socket at the same path, even if that socket is not visible in the container?


Solution

  • This is an exciting error! You're running into a naming collision in the UNIX domain socket abstract namespace. On Linux, a UNIX domain socket can be bound in one of two places: in the normal VFS, where it shows up as a file you can inspect with ls, or in the "abstract namespace", where the socket name starts with '\0' (yes, a literal null byte), has no file on disk, and is cleaned up when the last reference to the socket is closed (typically when the program that created it exits).

    Your strace output shows Xvfb trying to listen on a name in the abstract namespace: strace prints the leading null byte of sun_path as an @ in front of the opening quote. The notation isn't obvious, but that's what it means.

    Here's how it shows a "normal", VFS-backed socket:

    :;    strace -e bind socat UNIX-LISTEN:/tmp/asf-testing-hi STDOUT
    bind(5, {sa_family=AF_UNIX, sun_path="/tmp/asf-testing-hi"}, 21) = 0
    

    And here's a socket in the abstract namespace:

    :;    strace -e bind socat ABSTRACT-LISTEN:/tmp/asf-testing-hi STDOUT
    bind(5, {sa_family=AF_UNIX, sun_path=@"/tmp/asf-testing-hi"}, 22) = 0
    

    Welp, there's that @ sign.

    Back to the original problem: the abstract UNIX domain socket namespace is scoped to a network namespace, so a container running in "host" network mode shares the host's abstract socket names. Presumably the host's X server already holds @/tmp/.X11-unix/X0, which would explain why binding :0 fails with EADDRINUSE even though ls shows no file inside the container, and why :1 succeeds. If you need display :0 in the container, what you probably have to do is set up a separate network namespace for that container, and create a bridge for every network interface on the host that you want the container to have available. I believe https://stackunderflow.dev/p/network-namespaces-and-docker/ might have a decent tutorial for doing this manually. Hope this helps!
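
To see the collision without Xvfb or Docker, here is a minimal Python sketch (Linux only). It binds the same abstract name twice inside one network namespace, checks that no file appears on disk, and looks the name up in /proc/net/unix, where abstract names are listed with an @ prefix. The socket name and the helpers try_bind and abstract_names are made up for this illustration; they are not part of Xvfb, Docker, or strace.

```python
import errno
import os
import socket


def try_bind(name):
    """Bind a UNIX stream socket to name; return (sock, None) or (None, errno)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(name)
        return sock, None
    except OSError as exc:
        sock.close()
        return None, exc.errno


def abstract_names(proc_path="/proc/net/unix"):
    """Return the abstract socket names listed in proc_path, '@' prefix included."""
    names = []
    with open(proc_path) as f:
        next(f, None)  # skip the column header
        for line in f:
            fields = line.split(None, 7)  # the optional 8th column is the path
            if len(fields) == 8 and fields[7].startswith("@"):
                names.append(fields[7].rstrip("\n"))
    return names


# Hypothetical name, made unique per process so reruns do not clash.
path = "/tmp/abstract-demo-%d" % os.getpid()
name = "\0" + path  # the leading null byte selects the abstract namespace

first, err1 = try_bind(name)
second, err2 = try_bind(name)
print("first bind ok:", err1 is None)
print("second bind got EADDRINUSE:", err2 == errno.EADDRINUSE)
print("file exists on disk:", os.path.exists(path))
if os.path.exists("/proc/net/unix"):
    print("listed in /proc/net/unix:", ("@" + path) in abstract_names())

if first is not None:
    first.close()  # closing the last reference frees the name right away
third, err3 = try_bind(name)
print("bind after close ok:", err3 is None)
if third is not None:
    third.close()
```

Running abstract_names() inside the --network host container should show whether @/tmp/.X11-unix/X0 is already taken in the shared namespace, which ls cannot tell you.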