Tags: apache-kafka, ssh-tunnel

Consume from a Kafka Cluster through SSH Tunnel


We are trying to consume from a Kafka cluster using the Java client. The cluster is behind a jump host, so the only way to reach it is through an SSH tunnel. But we are not able to read anything, because once the consumer fetches the cluster metadata it uses the brokers' original hosts to connect. Can this behaviour be overridden? Can we ask the Kafka client not to use the metadata?
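
A minimal sketch of the kind of consumer involved (the forwarded local port, topic name, and group id are placeholders, not a real configuration): the bootstrap connection through a plain local port-forward succeeds, but the metadata response then points the client at the brokers' advertised hosts, which are unreachable from the local machine.

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class TunnelConsumerProblem {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Local end of a plain port-forward, e.g. ssh -L 9092:broker1.mykafkacluster:9092 jumphost.
            // The bootstrap connection goes through the tunnel and succeeds...
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "tunnel-test");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("my-topic"));
                // ...but the metadata response contains the brokers' advertised listeners
                // (broker1.mykafkacluster:9092, ...), which are unreachable from this machine,
                // so poll() never returns any records.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(10));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("%s -> %s%n", record.key(), record.value());
                }
            }
        }
    }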


Solution

  • Not as far as I know.

    The trick I used when I needed to do something similar was:

    1. set up a virtual interface for each Kafka broker
    2. open a tunnel to each broker so that broker n is bound to virtual interface n
    3. configure your /etc/hosts file so that the advertised hostname of broker n resolves to the IP address of virtual interface n

    For example:

    Kafka brokers:

    • broker1 (advertised as broker1.mykafkacluster)
    • broker2 (advertised as broker2.mykafkacluster)

    Virtual interfaces:

    • veth1 (192.168.1.1)
    • veth2 (192.168.1.2)

    Tunnels:

    • broker1: ssh -L 192.168.1.1:9092:broker1.mykafkacluster:9092 jumphost
    • broker2: ssh -L 192.168.1.2:9092:broker2.mykafkacluster:9092 jumphost

    /etc/hosts:

    • 192.168.1.1 broker1.mykafkacluster
    • 192.168.1.2 broker2.mykafkacluster

    If you configure your system like this, you should be able to reach all the brokers in your Kafka cluster (see the consumer sketch below).

    Note: if you configured your Kafka brokers to advertise an IP address instead of a hostname, the procedure can still work, but you need to configure the virtual interfaces with the same IP addresses that the brokers advertise.
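
    With the tunnels and /etc/hosts entries above in place, the Java consumer needs no special configuration: it can simply bootstrap against the advertised hostnames, which now resolve to the virtual interfaces. A minimal sketch, assuming the broker names from the example and a placeholder topic and group id:

        import java.time.Duration;
        import java.util.List;
        import java.util.Properties;

        import org.apache.kafka.clients.consumer.ConsumerConfig;
        import org.apache.kafka.clients.consumer.ConsumerRecord;
        import org.apache.kafka.clients.consumer.ConsumerRecords;
        import org.apache.kafka.clients.consumer.KafkaConsumer;
        import org.apache.kafka.common.serialization.StringDeserializer;

        public class TunnelConsumer {
            public static void main(String[] args) {
                Properties props = new Properties();
                // Use the brokers' advertised hostnames; /etc/hosts resolves each one to its
                // virtual interface, and the SSH tunnels forward the traffic to the real brokers.
                props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
                        "broker1.mykafkacluster:9092,broker2.mykafkacluster:9092");
                props.put(ConsumerConfig.GROUP_ID_CONFIG, "tunnel-test");
                props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
                props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
                props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

                try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                    consumer.subscribe(List.of("my-topic"));
                    while (true) {
                        // The hostnames returned in the metadata are the same advertised names,
                        // so every broker connection also goes through its tunnel.
                        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                        for (ConsumerRecord<String, String> record : records) {
                            System.out.printf("%s -> %s%n", record.key(), record.value());
                        }
                    }
                }
            }
        }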