
How to set Zookeeper dataDir in Docker (fig.yml)


I've configured Zookeeper and Kafka containers in a fig.yml file for Docker. Both containers start fine, but after sending a number of messages, my application, /zk-client, hangs. When I check the Zookeeper logs, I see this error:

Error Path:/brokers Error:KeeperErrorCode = NoNode for /brokers

My fig.yml is as follows:

zookeeper:
  image: wurstmeister/zookeeper
  ports:
    - "2181:2181"
  environment:
    ZK_ADVERTISED_HOST_NAME: xx.xx.x.xxx
    ZK_CONNECTION_TIMEOUT_MS: 6000
    ZK_SYNC_TIME_MS: 2000
    ZK_DATADIR: /path/to/data/zk/data/dir
kafka:
  image: wurstmeister/kafka:0.8.2.0
  ports:
    - "xx.xx.x.xxx:9092:9092"
  links:
    - zookeeper:zk
  environment:
    KAFKA_ADVERTISED_HOST_NAME: xx.xx.x.xxx
    KAFKA_LOG_DIRS: /home/svc_cis4/dl
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock

I've searched for quite a while but haven't found a solution. I've also tried setting the data directory in fig.yml with ZK_DATADIR: '/path/to/zk/data/dir', but that doesn't help. Any assistance would be appreciated.

UPDATE

Content of /opt/kafka_2.10-0.8.2.0/config/server.properties:

broker.id=0
port=9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
num.partitions=1
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
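
One thing worth checking in the file above is zookeeper.connect=localhost:2181. Inside the kafka container, localhost refers to that container itself, not to the linked zookeeper container, so this value can only work if the image's start script rewrites it at startup (whether it does is not shown here). Below is a minimal Python sketch that flags such an entry. The helper names parse_properties and loopback_zookeeper_hosts are hypothetical, and the sample text is a shortened copy of the properties listed above.

```python
def parse_properties(text):
    """Parse Java-style key=value properties, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def loopback_zookeeper_hosts(props):
    """Return the zookeeper.connect entries that point at the local machine."""
    connect = props.get("zookeeper.connect", "")
    # Drop an optional chroot suffix such as "/kafka" before splitting hosts.
    hosts_part = connect.split("/", 1)[0]
    loopback = {"localhost", "127.0.0.1"}
    hosts = [h.strip() for h in hosts_part.split(",")]
    return [h for h in hosts if h and h.rsplit(":", 1)[0] in loopback]


# Shortened copy of the server.properties content shown in the question.
SAMPLE = """\
broker.id=0
port=9092
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
"""

if __name__ == "__main__":
    flagged = loopback_zookeeper_hosts(parse_properties(SAMPLE))
    for host in flagged:
        print("zookeeper.connect points at the container itself:", host)
```

To check a real broker, read the file from the running container and pass its text to parse_properties instead of SAMPLE. An empty result means no loopback address was found, not that the connection string is correct.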

Solution

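  • The configuration below has been working for me without any issues for the last two days. It specifies the host address for both Zookeeper and Kafka: the Zookeeper port is published on the host address, and Kafka advertises that address through KAFKA_ADVERTISED_HOST_NAME. My fig.yml content is: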

    zookeeper:
      image: wurstmeister/zookeeper
      ports:
        - "xx.xx.x.xxx:2181:2181"
    kafka:
      image: wurstmeister/kafka:0.8.2.0
      ports:
        - "9092:9092"
      links:
        - zookeeper:zk
      environment:
        KAFKA_ADVERTISED_HOST_NAME: xx.xx.x.xxx
        KAFKA_NUM_REPLICA_FETCHERS: 4
        ...other env variables...
      volumes:
        - /var/run/docker.sock:/var/run/docker.sock
    validator:
      build: .
      volumes:
        - .:/host
      entrypoint: /bin/bash
      command: -c 'java -jar /host/app1.jar'
      links:
        - zookeeper:zk
        - kafka
    analytics:
      build: .
      volumes:
        - .:/host
      entrypoint: /bin/bash
      command: -c 'java -jar /host/app2.jar'
      links:
        - zookeeper:zk
        - kafka
    loader:
      build: .
      volumes:
        - .:/host
      entrypoint: /bin/bash
      command: -c 'java -jar /host/app3.jar'
      links:
        - zookeeper:zk
        - kafka
    

    And the accompanying Dockerfile content:

    FROM ubuntu:trusty
    
    MAINTAINER Wurstmeister
    
    RUN apt-get update && apt-get install -y unzip openjdk-7-jdk wget git docker.io
    
    RUN wget -q http://apache.mirrors.lucidnetworks.net/kafka/0.8.2.0/kafka_2.10-0.8.2.0.tgz -O /tmp/kafka_2.10-0.8.2.0.tgz
    RUN tar xfz /tmp/kafka_2.10-0.8.2.0.tgz -C /opt
    
    VOLUME ["/kafka"]
    
    ENV KAFKA_HOME /opt/kafka_2.10-0.8.2.0
    ADD start-kafka.sh /usr/bin/start-kafka.sh
    ADD broker-list.sh /usr/bin/broker-list.sh
    CMD start-kafka.sh
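
The two ports entries in the fig.yml above use different forms. "xx.xx.x.xxx:2181:2181" follows Docker's ip:hostPort:containerPort form and publishes Zookeeper only on that host address, while "9092:9092" is hostPort:containerPort and publishes Kafka on all host interfaces. As a rough illustration of how these strings break down, here is a small Python sketch. The names PortMapping and parse_port_mapping are hypothetical, the address 10.0.0.5 is a placeholder, and the sketch ignores IPv6 addresses, port ranges, and protocol suffixes such as /tcp.

```python
from typing import NamedTuple, Optional


class PortMapping(NamedTuple):
    host_ip: Optional[str]      # None means "all host interfaces"
    host_port: Optional[int]    # None means Docker picks a free host port
    container_port: int


def parse_port_mapping(spec):
    """Split a Docker-style port string into its parts.

    Handles "containerPort", "hostPort:containerPort" and
    "ip:hostPort:containerPort" only.
    """
    parts = spec.split(":")
    if len(parts) == 1:
        return PortMapping(None, None, int(parts[0]))
    if len(parts) == 2:
        return PortMapping(None, int(parts[0]), int(parts[1]))
    if len(parts) == 3:
        # "ip::containerPort" leaves the host port for Docker to choose.
        host_port = int(parts[1]) if parts[1] else None
        return PortMapping(parts[0], host_port, int(parts[2]))
    raise ValueError("unsupported port mapping: %r" % spec)


if __name__ == "__main__":
    # 10.0.0.5 stands in for the masked xx.xx.x.xxx address.
    for spec in ("10.0.0.5:2181:2181", "9092:9092"):
        print(spec, "->", parse_port_mapping(spec))
```

With a host_ip set, clients must reach the service through that address. With host_ip left as None, any address of the Docker host works, which is why the advertised host name still has to be set separately for Kafka.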