apache-kafka, docker-compose, confluent-platform

Single Confluent Kafka cluster across different host machines


I want to set up a Confluent Kafka cluster in KRaft mode in which each broker runs in its own Docker container, and each container is placed on a different host machine. Two of the nodes should act as both controller and broker, and one of them as broker only. To be clearer:

  1. broker_1 ---> HOST_MACHINE_1 with IP IP_1 [controller,broker]
  2. broker_2 ---> HOST_MACHINE_2 with IP IP_2 [broker]
  3. broker_3 ---> HOST_MACHINE_3 with IP IP_3 [controller,broker]

As a first attempt I started writing the docker-compose file for the first broker, but I am unsure how to set up the inter-broker communication.

version: '3.8'
volumes:
  kafka-data:
    driver: local

services:
  broker:
    image: confluentinc/cp-kafka:latest
    hostname: kafka1
    container_name: broker
    volumes:
      - kafka-data:/var/lib/kafka/data:Z
    ports:
      - "9092:9092"
      - "9101:9101"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "false"
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,KAFKA1:PLAINTEXT,EXTERNAL:PLAINTEXT'
      KAFKA_ADVERTISED_LISTENERS: 'KAFKA1://kafka1:29092,EXTERNAL://localhost:9092'
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_JMX_PORT: 9101
      KAFKA_JMX_HOSTNAME: localhost
      KAFKA_PROCESS_ROLES: 'broker,controller'
      KAFKA_CONTROLLER_QUORUM_VOTERS: '1@kafka1:29093,3@IP_3:29093'
      KAFKA_LISTENERS: 'KAFKA1://kafka1:29092,CONTROLLER://broker:29093,EXTERNAL://0.0.0.0:9092'
      KAFKA_INTER_BROKER_LISTENER_NAME: 'KAFKA1'
      KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
      KAFKA_LOG_DIRS: '/tmp/kraft-combined-logs'
      CLUSTER_ID: 'testId'

Do I have to add the HOST:PORT mappings of the other brokers to KAFKA_LISTENERS (i.e. KAFKA2://IP_2:9092,KAFKA3://IP_3:9092)? Should I declare the same CLUSTER_ID in the other two Compose files?

I am a little lost. I would be very thankful if someone could explain how this works in general, not only for this specific use case; I include it just to make the explanation easier.


Solution

  • If the containers run on different hosts, each host must be able to resolve its neighbours' names, and network traffic between the hosts must be allowed on the ports in use (9092, 29092, 29093, etc.). Do not use localhost in the config any more; use real host names or IP addresses and ports instead. Here are example configs for three hosts, named host1, host2 and host3.

    Config for host1:

    version: '3'
    services:
      kafka1:
        image: confluentinc/cp-kafka
        container_name: kafka1
        hostname: kafka1
        ports:
          - "9092:9092"
          - "29093:29093"
          - "29092:29092"
        environment:
          KAFKA_NODE_ID: 1
          KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT'
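          # Bind on all interfaces inside the container; advertise the name that other hosts can resolve
          # (Kafka rejects 0.0.0.0 as an advertised address).
          KAFKA_LISTENERS: 'INTERNAL://0.0.0.0:29092,CONTROLLER://0.0.0.0:29093,EXTERNAL://0.0.0.0:9092'
          KAFKA_ADVERTISED_LISTENERS: 'INTERNAL://host1:29092,EXTERNAL://host1:9092'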
          KAFKA_INTER_BROKER_LISTENER_NAME: 'INTERNAL'
          KAFKA_CONTROLLER_QUORUM_VOTERS: '1@host1:29093,2@host2:29093,3@host3:29093'
          KAFKA_PROCESS_ROLES: 'broker,controller'
          KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
          KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 3
          CLUSTER_ID: 'ciWo7IWazngRchmPES6q5A=='
          KAFKA_LOG_DIRS: '/tmp/kraft-combined-logs'
    

    Config for host2:

    version: '3'
    services:
      kafka2:
        image: confluentinc/cp-kafka
        container_name: kafka2
        hostname: kafka2
        ports:
          - "9092:9092"
          - "29093:29093"
          - "29092:29092"
        environment:
          KAFKA_NODE_ID: 2
          KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT'
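          # Bind on all interfaces inside the container; advertise the name that other hosts can resolve
          # (Kafka rejects 0.0.0.0 as an advertised address).
          KAFKA_LISTENERS: 'INTERNAL://0.0.0.0:29092,CONTROLLER://0.0.0.0:29093,EXTERNAL://0.0.0.0:9092'
          KAFKA_ADVERTISED_LISTENERS: 'INTERNAL://host2:29092,EXTERNAL://host2:9092'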
          KAFKA_INTER_BROKER_LISTENER_NAME: 'INTERNAL'
          KAFKA_CONTROLLER_QUORUM_VOTERS: '1@host1:29093,2@host2:29093,3@host3:29093'
          KAFKA_PROCESS_ROLES: 'broker,controller'
          KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
          KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 3
          CLUSTER_ID: 'ciWo7IWazngRchmPES6q5A=='
          KAFKA_LOG_DIRS: '/tmp/kraft-combined-logs'
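
    The config above makes host2 a combined broker and controller. If host2 should be broker only, as in the question, a variant along the following lines might work. This is a sketch rather than a tested config, under these assumptions: the CONTROLLER listener is dropped from KAFKA_LISTENERS (a broker-only node does not bind it) while KAFKA_CONTROLLER_LISTENER_NAMES and the CONTROLLER entry in the protocol map stay so the broker knows how to reach the controllers, and KAFKA_CONTROLLER_QUORUM_VOTERS lists only nodes 1 and 3 on every host.

    ```yaml
    # Hypothetical broker-only variant of the host2 service (sketch, not verified).
    services:
      kafka2:
        image: confluentinc/cp-kafka
        container_name: kafka2
        hostname: kafka2
        ports:
          - "9092:9092"
          - "29092:29092"          # no 29093: this node runs no controller
        environment:
          KAFKA_NODE_ID: 2
          KAFKA_PROCESS_ROLES: 'broker'
          # Still required so the broker knows which listener the controllers use.
          KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT'
          KAFKA_LISTENERS: 'INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092'
          KAFKA_ADVERTISED_LISTENERS: 'INTERNAL://host2:29092,EXTERNAL://host2:9092'
          KAFKA_INTER_BROKER_LISTENER_NAME: 'INTERNAL'
          # Assumption: only host1 and host3 are controllers; use the same value on all three hosts.
          KAFKA_CONTROLLER_QUORUM_VOTERS: '1@host1:29093,3@host3:29093'
          CLUSTER_ID: 'ciWo7IWazngRchmPES6q5A=='
    ```

    Note that a quorum of two voters needs both of them up to have a majority, so it tolerates no controller failure; three voters tolerate one.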
    

    Config for host3:

    version: '3'
    services:
      kafka3:
        image: confluentinc/cp-kafka
        container_name: kafka3
        hostname: kafka3
        ports:
          - "9092:9092"
          - "29093:29093"
          - "29092:29092"
        environment:
          KAFKA_NODE_ID: 3
          KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT'
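          # Bind on all interfaces inside the container; advertise the name that other hosts can resolve
          # (Kafka rejects 0.0.0.0 as an advertised address).
          KAFKA_LISTENERS: 'INTERNAL://0.0.0.0:29092,CONTROLLER://0.0.0.0:29093,EXTERNAL://0.0.0.0:9092'
          KAFKA_ADVERTISED_LISTENERS: 'INTERNAL://host3:29092,EXTERNAL://host3:9092'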
          KAFKA_INTER_BROKER_LISTENER_NAME: 'INTERNAL'
          KAFKA_CONTROLLER_QUORUM_VOTERS: '1@host1:29093,2@host2:29093,3@host3:29093'
          KAFKA_PROCESS_ROLES: 'broker,controller'
          KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
          KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 3
          CLUSTER_ID: 'ciWo7IWazngRchmPES6q5A=='
          KAFKA_LOG_DIRS: '/tmp/kraft-combined-logs'
    
      schema-registry:
        image: confluentinc/cp-schema-registry
        container_name: schema-registry
        hostname: schema-registry
        ports:
          - "8081:8081"
        environment:
          SCHEMA_REGISTRY_HOST_NAME: schema-registry
          SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'host1:29092,host2:29092,host3:29092'
          SCHEMA_REGISTRY_LISTENERS: 'http://0.0.0.0:8081'
        depends_on:
          - kafka3
    
    networks:
      default:
        name: my-network
        external: true
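
    The names host1, host2 and host3 must also resolve inside the containers, not only on the hosts. If DNS does not already provide that, one possible approach is Compose's extra_hosts key, which adds entries to the container's /etc/hosts. The fragment below is a sketch to merge into each service; the addresses are placeholders from the documentation range, standing in for the real IP_1, IP_2 and IP_3.

    ```yaml
    # Fragment for the kafka1 service; repeat the same extra_hosts list for kafka2 and kafka3.
    services:
      kafka1:
        extra_hosts:
          - "host1:192.0.2.11"   # placeholder for IP_1
          - "host2:192.0.2.12"   # placeholder for IP_2
          - "host3:192.0.2.13"   # placeholder for IP_3
    ```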
    

    These configs may contain mistakes, but this is the general idea. I hope it helps.