apache-kafka, cloudera, cloudera-cdh, kafka-python

How to reach Cloudera Kafka Broker on private network from outside?


I have a cluster inside a VPN that contains a server with a private IP. I'm trying to set up Kafka communication between an external server and my private server. My approach is to add an iptables rule that maps a public IP to my private IP. I also opened ports 9092 and 9093 to make them reachable from outside. Now I am able to connect to my server through the public IP from the external server.

telnet <public_ip> 9092
Connected to <public_ip>

My Kafka broker runs in a Cloudera cluster and I created it with Cloudera Manager. The configuration is the following:

kafka.properties:

listeners=PLAINTEXT://<private_ip>:9092,SSL://<private_ip>:9093
advertised.listeners=PLAINTEXT://<private_ip>:9092,SSL://<private_ip>:9093

advertised.host.name:

<public_ip>

With this broker configuration, communication works perfectly inside the cluster using either the public_ip or the private_ip of the Kafka broker host.

So I have a working broker that can be used through its public_ip, and an external server that can reach the public_ip and its required ports. But when I try to connect to the broker from the external server, I get the following error:

NO BROKERS AVAILABLE

There is no further information about the error. On my external server I have the kafka-python package, where I configure the producer as:

"bootstrap_servers": ["<public_ip>:9092"]

for an existing topic on my Kafka broker.
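Since the error carries no detail, one way to get more out of the client is to raise the log level of the kafka-python loggers before creating the producer. The following is only a sketch: the helper names `enable_kafka_debug_logging` and `build_producer` are made up for illustration, and it assumes kafka-python logs under the `kafka` logger namespace.

```python
import logging


def enable_kafka_debug_logging():
    """Route log records to stderr and set the 'kafka' logger tree to DEBUG."""
    # basicConfig attaches a stderr handler to the root logger.
    logging.basicConfig(level=logging.INFO)
    kafka_logger = logging.getLogger("kafka")
    # Child loggers such as 'kafka.conn' inherit this level.
    kafka_logger.setLevel(logging.DEBUG)
    return kafka_logger


def build_producer(bootstrap_servers):
    """Hypothetical helper: create a producer for the given bootstrap list."""
    # Imported lazily so this module loads even without kafka-python installed.
    from kafka import KafkaProducer
    return KafkaProducer(bootstrap_servers=bootstrap_servers)


enable_kafka_debug_logging()
```

Calling `build_producer(["<public_ip>:9092"])` afterwards should then print the connection attempts, which may show the client trying an address other than the one given in `bootstrap_servers`.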

Specifications:

private host

cloudera: CDH 5.12.0

kafka: kafka 2.2.0-1.2.2.0

zookeeper: Zookeeper 3.4.5

external host

kafka Python package: kafka-python==1.4.2

The problem is very similar to this post, but in that case he uses a forwarded port with a public IP. Is it possible to do this with iptables? Has anyone managed to do it on a Cloudera cluster?

Thank you in advance.


Solution

  • The question isn't specific to Cloudera or Python, and I don't think Cloudera Manager has a setting that will set this up for you.

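    advertised.listeners has to contain a publicly resolvable address that clients can use to reach each broker individually (e.g. two brokers cannot share the same listener setting and both be reached through a single port forward from the public address to the internal address). Clients use the bootstrap address only for the first connection; after that they connect to whatever addresses the brokers advertise, so an advertised private IP is unreachable from outside.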

    Your setup is very similar to Kafka running in Docker or on cloud providers such as AWS, in that you're interacting across two networks, so refer to this blog for more information.

    Also, unless you set up other firewall rules to prevent random access, don't expose brokers over the plaintext protocol.
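To make the point about advertised.listeners concrete, here is a small sketch that parses a listener string and flags entries whose host is a private-range IP, i.e. addresses an external client could not connect to. The function names and the sample address `10.0.0.5` are assumptions for illustration; the question only gives `<private_ip>`.

```python
import ipaddress


def parse_listeners(value):
    """Split 'PROTO://host:port,...' into (protocol, host, port) tuples."""
    result = []
    for item in value.split(","):
        protocol, _, address = item.strip().partition("://")
        host, _, port = address.rpartition(":")
        result.append((protocol, host, int(port)))
    return result


def unreachable_from_outside(value):
    """Return the listeners whose host is a non-public IP literal."""
    flagged = []
    for protocol, host, port in parse_listeners(value):
        try:
            if ipaddress.ip_address(host).is_private:
                flagged.append((protocol, host, port))
        except ValueError:
            # A hostname rather than an IP literal; cannot judge without DNS.
            pass
    return flagged


# Hypothetical stand-in for the <private_ip> values in the question.
advertised = "PLAINTEXT://10.0.0.5:9092,SSL://10.0.0.5:9093"
print(unreachable_from_outside(advertised))
```

With the configuration from the question, both listeners would be flagged, which matches the symptom: the bootstrap connection to the public IP succeeds, but the broker then tells the client to use the private address.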