
How to deploy Kafka Connect on AWS with an Auto Scaling group


It looks like running Kafka Connect in distributed mode requires a unique hostname in CONNECT_REST_ADVERTISED_HOST_NAME. However, if we deploy the connector on AWS with an Auto Scaling group, the hostname is not known in advance. How can I set this up?


Solution

  • You can do this with the following steps, starting from a pre-created EC2 instance.

    1. Set the rest.advertised.host.name property in the connect-distributed.properties file to the placeholder string 'hostname':

    rest.advertised.host.name=hostname

    2. Create the kafka-connect.service file

    nano /etc/systemd/system/kafka-connect.service

    [Unit]
    Description=Kafka-connect
    After=network.target
    
    [Service]
    User=ubuntu
    Group=ubuntu
    Environment="KAFKA_HEAP_OPTS=-Xmx4G -Xms2G"
    Environment="KAFKA_OPTS=-javaagent:/home/ubuntu/prometheus/jmx_prometheus_javaagent-0.16.1.jar=8080:/home/ubuntu/prometheus/kafka-connect.yml"
    ExecStart=/home/ubuntu/kafka/kafka_2.13-2.7.0/bin/connect-distributed.sh /home/ubuntu/config/connect-distributed.properties
    
    [Install]
    WantedBy=multi-user.target
    

    3. Take a snapshot of this EC2 instance's volume.

    4. Create a launch configuration

    Use the snapshot as the volume, and use the following user data:

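    #!/bin/bash
    
    apt-get update
    apt-get -y upgrade
    # Replace the 'hostname' placeholder with this instance's private IP.
    # hostname -I can list several addresses, so keep only the first one.
    sed -i "s/hostname/$(hostname -I | awk '{print $1}')/g" /home/ubuntu/config/connect-distributed.properties
    # Enable the service at boot, then start it now.
    systemctl enable kafka-connect
    systemctl start kafka-connect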
    

    The sed command replaces the 'hostname' placeholder in connect-distributed.properties with the instance's private IP, so rest.advertised.host.name is set correctly whenever a new instance starts from the Auto Scaling group.
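
    A plain search-and-replace on the word 'hostname' would also rewrite any comment or other value in the file that happens to contain it. The following is a minimal sketch of a stricter variant that anchors the substitution on the property key. The helper name set_advertised_host, the temporary demo file, and the sample address 10.0.1.23 are assumptions made for illustration; they are not part of Kafka or of the original answer. It assumes GNU sed, as shipped on Ubuntu.

```shell
#!/bin/bash

# Hypothetical helper: rewrite only the rest.advertised.host.name line.
# $1 = path to the properties file, $2 = host or IP to advertise.
set_advertised_host() {
    local props_file="$1" host="$2"
    # Anchor on the property key so other occurrences of "hostname" stay untouched.
    sed -i "s|^rest\.advertised\.host\.name=.*|rest.advertised.host.name=${host}|" "$props_file"
}

# Demonstrate on a throwaway file instead of the real worker config.
demo_file="$(mktemp)"
printf '%s\n' \
    '# the hostname placeholder is replaced at boot' \
    'rest.advertised.host.name=hostname' \
    'group.id=connect-cluster' > "$demo_file"

# 10.0.1.23 is a made-up sample address.
set_advertised_host "$demo_file" "10.0.1.23"
cat "$demo_file"
```

    In the user data script, the call would presumably look like set_advertised_host /home/ubuntu/config/connect-distributed.properties "$(hostname -I | awk '{print $1}')". Because the pattern matches the key rather than the placeholder, running it a second time simply overwrites the value again.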