Tags: docker, apache-kafka, docker-compose

Docker compose doesn't save data in volume


I am trying to run Kafka with docker-compose. I have this yml file:

version: '3'


services:
  zookeeper:
    image: ${REPOSITORY}/cp-zookeeper:${TAG}
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - ./zoo:/var/lib/zookeeper

  broker:
    image: ${REPOSITORY}/cp-kafka:${TAG}
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "29092:29092"
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
    volumes:
      - ./broker:/var/lib/kafka

In the directory containing the docker-compose.yml file, I ran this command:

docker-compose up -d

After that, the folders ./broker and ./zoo appear in my directory. Inside, they have the same structure as in the containers (./zoo/data, ./broker/data), but the directories contain no files.

I tried

docker-compose exec broker ls /var/lib/kafka/data

and I saw the folders and files for the default topics.


Solution

  • This comes down to the interaction between the volumes declared in each image's Dockerfile and the volume that you are trying to mount in Docker Compose.

    If you look at each image's Dockerfile, you'll see that it declares volumes. You can also see them by running docker inspect against the container. Here's what it looks like when using your configuration:

    ➜ docker inspect zookeeper|jq '.[].Mounts[] | .Type ,.Destination'
    "volume"
    "/etc/zookeeper/secrets"
    "bind"
    "/var/lib/zookeeper"
    "volume"
    "/var/lib/zookeeper/log"
    "volume"
    "/var/lib/zookeeper/data"
    

    You'll notice that there are two volumes (declared in the image itself, i.e. in the Dockerfile) on the specific data paths for ZooKeeper:

    • /var/lib/zookeeper/log
    • /var/lib/zookeeper/data

    In addition, there is the bind mount from the Docker Compose:

    • /var/lib/zookeeper/

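    These clash: the image's volumes are mounted on top of the data and log subdirectories of your bind mount, so the files are written to Docker-managed volumes rather than to ./zoo on the host. That explains the empty directories you're seeing.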

    A similar pattern exists for the broker.
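
    The clash described above can be spotted programmatically. The following is only a sketch in Python, not part of any Docker tooling: the function name shadowed_volumes is made up for this example, and the mount list is copied by hand from the docker inspect output above rather than read from Docker.

```python
from pathlib import PurePosixPath


def shadowed_volumes(mounts):
    """Return (bind_destination, volume_destination) pairs where a volume
    is mounted at a path nested inside a bind mount."""
    binds = [PurePosixPath(m["Destination"]) for m in mounts if m["Type"] == "bind"]
    volumes = [PurePosixPath(m["Destination"]) for m in mounts if m["Type"] == "volume"]
    clashes = []
    for bind in binds:
        for vol in volumes:
            # True only when the volume path sits strictly below the bind path.
            if bind in vol.parents:
                clashes.append((str(bind), str(vol)))
    return clashes


# Type/Destination pairs taken from the `docker inspect zookeeper` output above.
zookeeper_mounts = [
    {"Type": "volume", "Destination": "/etc/zookeeper/secrets"},
    {"Type": "bind", "Destination": "/var/lib/zookeeper"},
    {"Type": "volume", "Destination": "/var/lib/zookeeper/log"},
    {"Type": "volume", "Destination": "/var/lib/zookeeper/data"},
]

for bind, vol in shadowed_volumes(zookeeper_mounts):
    print(f"{vol} (volume) is nested under {bind} (bind mount)")
```

    To check a live container, the same function could be given the Mounts list parsed from the JSON that docker inspect prints (the first element of the top-level array).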


    So in short, you need to mount a separate host directory for each specific volume declared in the image:

    ---
    version: '3'
    
    
    services:
      zookeeper:
        image: confluentinc/cp-zookeeper:5.4.1
        hostname: zookeeper
        container_name: zookeeper
        ports:
          - "2181:2181"
        environment:
          ZOOKEEPER_CLIENT_PORT: 2181
          ZOOKEEPER_TICK_TIME: 2000
        volumes: 
          - ./zoo/data:/var/lib/zookeeper/data
          - ./zoo/log:/var/lib/zookeeper/log
    
      broker:
        image: confluentinc/cp-kafka:5.4.1
        hostname: broker
        container_name: broker
        depends_on:
          - zookeeper
        ports:
          - "9092:9092"
        environment:
          KAFKA_BROKER_ID: 1
          KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
          KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
        volumes: 
          - ./broker/data:/var/lib/kafka/data
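
    The volumes entries in this file follow a simple pattern: one host subdirectory per volume path declared in the image. As a rough illustration, a small Python helper could derive them. The name per_volume_binds is hypothetical, and the volume paths are the ones shown earlier in this answer.

```python
import posixpath


def per_volume_binds(host_dir, container_dir, volume_paths):
    """Build one 'host:container' bind entry per declared volume path."""
    entries = []
    for vol in volume_paths:
        # Path of the volume relative to the parent directory, e.g. "data".
        rel = posixpath.relpath(vol, container_dir)
        entries.append(f"{host_dir}/{rel}:{vol}")
    return entries


zoo_entries = per_volume_binds(
    "./zoo",
    "/var/lib/zookeeper",
    ["/var/lib/zookeeper/data", "/var/lib/zookeeper/log"],
)
for entry in zoo_entries:
    print(f"- {entry}")
```

    The printed lines match the two ZooKeeper entries in the file above.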
    

    With this done, we can see there are no conflicts in the container paths:

    ➜ docker inspect zookeeper|jq '.[].Mounts '
    [
      {
        "Type": "bind",
        "Source": "/private/tmp/zoo/log",
        "Destination": "/var/lib/zookeeper/log",
        "Mode": "rw",
        "RW": true,
        "Propagation": "rprivate"
      },
      {
        "Type": "bind",
        "Source": "/private/tmp/zoo/data",
        "Destination": "/var/lib/zookeeper/data",
        "Mode": "rw",
        "RW": true,
        "Propagation": "rprivate"
      },
      {
        "Type": "volume",
        "Name": "6cbb584e0d9aa2f119869b264544f587909d9f417fc553a7bb2954dd28ecb8ea",
        "Source": "/var/lib/docker/volumes/6cbb584e0d9aa2f119869b264544f587909d9f417fc553a7bb2954dd28ecb8ea/_data",
        "Destination": "/etc/zookeeper/secrets",
        "Driver": "local",
        "Mode": "",
        "RW": true,
        "Propagation": ""
      }
    ]
    

    and data from the containers:

    ➜ docker exec zookeeper ls -l /var/lib/zookeeper/data /var/lib/zookeeper/log
    /var/lib/zookeeper/data:
    total 0
    drwxr-xr-x 3 root root 96 Apr  3 08:59 version-2
    
    /var/lib/zookeeper/log:
    total 0
    drwxr-xr-x 3 root root 96 Apr  3 08:59 version-2
    
    ➜ docker exec broker ls -l /var/lib/kafka/data
    total 16
    drwxr-xr-x 6 root root 192 Apr  3 08:59 __confluent.support.metrics-0
    -rw-r--r-- 1 root root   0 Apr  3 08:59 cleaner-offset-checkpoint
    -rw-r--r-- 1 root root   4 Apr  3 09:01 log-start-offset-checkpoint
    -rw-r--r-- 1 root root  88 Apr  3 08:59 meta.properties
    -rw-r--r-- 1 root root  36 Apr  3 09:01 recovery-point-offset-checkpoint
    -rw-r--r-- 1 root root  36 Apr  3 09:02 replication-offset-checkpoint
    -rw-r--r-- 1 root root   0 Apr  3 08:30 wibble
    

    is stored on the local host:

    ➜ ls -l broker/data zoo/data zoo/log
    broker/data:
    total 32
    drwxr-xr-x  6 rmoff  wheel  192  3 Apr 09:59 __confluent.support.metrics-0
    -rw-r--r--  1 rmoff  wheel    0  3 Apr 09:59 cleaner-offset-checkpoint
    -rw-r--r--  1 rmoff  wheel    4  3 Apr 10:00 log-start-offset-checkpoint
    -rw-r--r--  1 rmoff  wheel   88  3 Apr 09:59 meta.properties
    -rw-r--r--  1 rmoff  wheel   36  3 Apr 10:00 recovery-point-offset-checkpoint
    -rw-r--r--  1 rmoff  wheel   36  3 Apr 10:01 replication-offset-checkpoint
    -rw-r--r--  1 rmoff  wheel    0  3 Apr 09:30 wibble
    
    zoo/data:
    total 0
    drwxr-xr-x  3 rmoff  wheel  96  3 Apr 09:59 version-2
    
    zoo/log:
    total 0
    drwxr-xr-x  3 rmoff  wheel  96  3 Apr 09:59 version-2
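
    The same check on the host side can be scripted. This is a minimal sketch, assuming it is run from the directory that holds docker-compose.yml and that the host directories are named as in the file above. The function name empty_data_dirs is made up for this example.

```python
from pathlib import Path


def empty_data_dirs(base, subdirs=("broker/data", "zoo/data", "zoo/log")):
    """Return the subdirectories of `base` that are missing or have no entries."""
    empty = []
    for sub in subdirs:
        path = Path(base) / sub
        # A missing directory counts as empty for the purpose of this check.
        if not path.is_dir() or not any(path.iterdir()):
            empty.append(sub)
    return empty


if __name__ == "__main__":
    problems = empty_data_dirs(".")
    if problems:
        print("No data found on the host in:", ", ".join(problems))
    else:
        print("All host data directories contain files.")
```

    With the original single bind mount per service, a check like this would be expected to report the data directories as empty, which is the symptom described in the question.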
    

    See also Data Volumes for Kafka and ZooKeeper