It looks like setting up a Kafka connector in distributed mode requires a unique hostname in CONNECT_REST_ADVERTISED_HOST_NAME. However, if we deploy the connector on AWS in an Auto Scaling group, the hostname is not known in advance. How can I do the setup?
You can achieve this with the following steps on a pre-created EC2 instance.
1. Set the rest.advertised.host.name property in the connect-distributed.properties file to the placeholder value hostname:
rest.advertised.host.name=hostname
2. Create the kafka-connect.service file:
nano /etc/systemd/system/kafka-connect.service
[Unit]
Description=Kafka-connect
After=network.target
[Service]
User=ubuntu
Group=ubuntu
Environment="KAFKA_HEAP_OPTS=-Xmx4G -Xms2G"
Environment="KAFKA_OPTS=-javaagent:/home/ubuntu/prometheus/jmx_prometheus_javaagent-0.16.1.jar=8080:/home/ubuntu/prometheus/kafka-connect.yml"
ExecStart=/home/ubuntu/kafka/kafka_2.13-2.7.0/bin/connect-distributed.sh /home/ubuntu/config/connect-distributed.properties
[Install]
WantedBy=multi-user.target
3. Take a snapshot of this EC2 instance's volume.
4. Use the snapshot as the volume for instances launched by the Auto Scaling group, and supply the following user data:
#!/bin/bash
apt-get update
apt-get -y upgrade
# hostname -I can print several addresses; keep only the first one
sed -i "s/hostname/$(hostname -I | awk '{print $1}')/g" /home/ubuntu/config/connect-distributed.properties
systemctl start kafka-connect
systemctl enable kafka-connect
The sed command replaces the 'hostname' string in connect-distributed.properties (the value of rest.advertised.host.name) with the instance's private IP whenever a new instance starts from the Auto Scaling group.
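To see what the substitution does before baking it into an image, it can be tried on a scratch file. The sketch below is only an illustration under some assumptions: the helper name set_advertised_host, the scratch file, and the address 10.0.1.25 are all made up for the demo. It also matches on the property key rather than on the bare word 'hostname', which is a variation on the command above and not part of it, so that other lines containing that word are left untouched.

```shell
#!/bin/bash
# Hypothetical helper (not part of Kafka): rewrite the value of
# rest.advertised.host.name in a properties file to the given address.
set_advertised_host() {
  local file="$1" addr="$2"
  # Match the whole property line so unrelated lines are not modified.
  sed "s/^rest\.advertised\.host\.name=.*/rest.advertised.host.name=${addr}/" \
    "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
}

# Demo on a scratch file with made-up contents.
demo_file="$(mktemp)"
printf '%s\n' \
  'bootstrap.servers=localhost:9092' \
  'rest.advertised.host.name=hostname' > "$demo_file"

set_advertised_host "$demo_file" "10.0.1.25"
cat "$demo_file"
```

In the user data script, the second argument would be the instance's address, for example the first field of hostname -I, and the first argument would be /home/ubuntu/config/connect-distributed.properties.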