
Can two Debezium connectors run in one Kafka cluster?


A diagram on the official Debezium page shows that multiple Debezium connectors can connect to the same Kafka cluster.

So I have 2 databases, 2 Debezium instances, and 1 Kafka broker running in docker-compose, but it seems that only one Debezium instance sends updates to Kafka (as seen in Kafdrop).

Here's my docker-compose file:

version: '3.6'
services:

  hero_db:
    image: postgres:14
    restart: always
    environment:
      POSTGRES_PASSWORD: postgrespassword
    ports:
      - '5432:5432'
    expose:
      - '5432'
    command: [ "postgres", "-c", "wal_level=logical" ]
    volumes:
      - hero_db_data:/var/lib/postgresql/data

  villian_db:
    image: postgres:14
    restart: always
    environment:
      POSTGRES_PASSWORD: postgrespassword
    ports:
      - '2345:2345'
    expose:
      - '2345'
    command: [ "postgres", "-c", "wal_level=logical" ]
    volumes:
      - villian_db_data:/var/lib/postgresql/data

  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 22181:2181

  kafka:
    image: confluentinc/cp-kafka:5.3.1
    ports:
      - 29092:29092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    depends_on:
      - zookeeper

  kafdrop:
    image: obsidiandynamics/kafdrop
    container_name: kafdrop
    ports:
      - "9000:9000"
    environment:
      KAFKA_BROKERCONNECT: "kafka:9092"
      JVM_OPTS: "-Xms16M -Xmx48M -Xss180K -XX:-TieredCompilation -XX:+UseStringDeduplication -noverify"
    depends_on:
      - kafka

  hero_debezium:
    image: debezium/connect:1.9
    ports:
      - 8083:8083
    expose:
      - '8083'
    environment:
      CONFIG_STORAGE_TOPIC: hero_configs
      OFFSET_STORAGE_TOPIC: hero_offsets
      STATUS_STORAGE_TOPIC: hero_statuses
      BOOTSTRAP_SERVERS: kafka:9092
    depends_on: [ zookeeper, kafka, hero_db ]

  villian_debezium:
    image: debezium/connect:1.9
    ports:
      - 8084:8083
    expose:
      - '8084'
    environment:
      CONFIG_STORAGE_TOPIC: villian_configs
      OFFSET_STORAGE_TOPIC: villian_offsets
      STATUS_STORAGE_TOPIC: villian_statuses
      BOOTSTRAP_SERVERS: kafka:9092
    depends_on: [ zookeeper, kafka, villian_db ]

volumes:
  hero_db_data:
  villian_db_data:

Here are the Debezium connector configs (JSON) for hero_dbz and villian_dbz:
hero_dbz.json

{
    "name": "hero-postgresql-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",
        "database.hostname": "hero_db",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "postgrespassword",
        "database.dbname": "postgres",
        "database.server.name": "hero_server",
        "table.include.list": "public.heroes",
        "table.whitelist": "public.heroes",
        "topic.prefix": "topic_heroes"
    }
}

villian_dbz.json

{
    "name": "villian-postgresql-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",
        "database.hostname": "villian_db",
        "database.port": "2345",
        "database.user": "postgres",
        "database.password": "postgrespassword",
        "database.dbname": "postgres",
        "database.server.name": "villian_server",
        "table.include.list": "public.villians",
        "table.whitelist": "public.villians",
        "topic.prefix": "topic_villian"
    }
}

I registered both hero_dbz and villian_dbz with these commands:
curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" 127.0.0.1:8083/connectors/ --data "@hero_dbz.json"
curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" 127.0.0.1:8084/connectors/ --data "@villian_dbz.json"

Kafdrop shows only data from hero_db (hero_server.public.heroes) and nothing from villian_db.



Solution

  • Running multiple Kafka Connect servers shouldn't be a problem. There might be an issue in your configuration and setup. Check the logs of the villian connector.

    The internal port of villian_db is wrong, IMHO. Postgres listens on 5432 inside the container, so host port 2345 should map to it:

    villian_db:
        image: postgres:14
        restart: always
        environment:
          POSTGRES_PASSWORD: postgrespassword
        ports:
          - '2345:5432'
        expose:
          - '5432'
        command: [ "postgres", "-c", "wal_level=logical" ]
        volumes:
          - villian_db_data:/var/lib/postgresql/data
    

    Updates:

    Please set a different GROUP_ID environment variable on each Connect cluster. Background:

    This environment variable is required when running the Kafka Connect service. Set this to an ID that uniquely identifies the Kafka Connect cluster the service and its workers belong to.

    Something like GROUP_ID: 3 for one and GROUP_ID: 2 for the other. Both .json files should also point to 5432, the internal Postgres port, so "database.port" in villian_dbz.json should be "5432" rather than "2345".

    Your ZooKeeper port mapping also looks misconfigured. Please fix it like this:

    ....
    ports:
          - 2181:2181
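
    As a rough sketch of the GROUP_ID change described above, the two Connect services could look like the following. The values 2 and 3 are arbitrary assumptions and only need to differ from each other. Everything else is copied from the compose file in the question, with the villian service's `expose` entry adjusted to the container port 8083 that its own port mapping targets.

    ```yaml
    services:
      hero_debezium:
        image: debezium/connect:1.9
        ports:
          - 8083:8083
        environment:
          GROUP_ID: 2                          # assumed value; must be unique per Connect cluster
          CONFIG_STORAGE_TOPIC: hero_configs
          OFFSET_STORAGE_TOPIC: hero_offsets
          STATUS_STORAGE_TOPIC: hero_statuses
          BOOTSTRAP_SERVERS: kafka:9092
        depends_on: [ zookeeper, kafka, hero_db ]

      villian_debezium:
        image: debezium/connect:1.9
        ports:
          - 8084:8083                          # host 8084 -> container 8083
        expose:
          - '8083'                             # the container listens on 8083, not 8084
        environment:
          GROUP_ID: 3                          # assumed value; differs from hero_debezium
          CONFIG_STORAGE_TOPIC: villian_configs
          OFFSET_STORAGE_TOPIC: villian_offsets
          STATUS_STORAGE_TOPIC: villian_statuses
          BOOTSTRAP_SERVERS: kafka:9092
        depends_on: [ zookeeper, kafka, villian_db ]
    ```

    Workers that share a group ID are treated as members of the same Connect cluster, so with separate config, offset, and status topics per service, each service likely needs its own ID as well.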