apache-kafka broker

Appropriate settings to register broker/port information for Kafka Cluster


I read the following in the Kafka Confluence wiki and quote it below:

Why do I see error "Should not set log end offset on partition" in the broker log?

Typically, you will see errors like the following.

kafka.common.KafkaException: Should not set log end offset on partition [test,22]'s local replica 4
ERROR [ReplicaFetcherThread-0-6], Error for partition [test,22] to broker 6:class kafka.common.UnknownException (kafka.server.ReplicaFetcherThread)

A common problem is that more than one broker registered the same host/port in Zookeeper. As a result, the replica fetcher is confused when fetching data from the leader. To verify that, you can use a Zookeeper client shell to list the registration info of each broker. The Zookeeper path and the format of the broker registration is described in Kafka data structures in Zookeeper. You want to make sure that all the registered brokers have unique host/port.

According to the official documentation, a listener of PLAINTEXT://0.0.0.0:9092 binds to all interfaces on port 9092, while leaving the hostname empty (PLAINTEXT://:9092) binds to the default interface. In both cases the broker registers port 9092.

If this is true, then I don't see how registering every broker as 0.0.0.0:9092 can avoid confusing Zookeeper. I think that unless I explicitly specify the hostname or IP address along with the port, Zookeeper will always get confused, since all brokers register with the same interface and port number. I have confirmed this by running get /brokers/ids/{id} in zookeeper-shell.bat.

The following is the Zookeeper client shell output for each broker under /brokers/ids:

get /brokers/ids/1
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://0.0.0.0:9092"],"jmx_port":-1,"host":"0.0.0.0","timestamp":"1500646657734","port":9092,"version":4}
cZxid = 0xe0000000f
ctime = Fri Jul 21 14:17:37 UTC 2017
mZxid = 0xe0000000f
mtime = Fri Jul 21 14:17:37 UTC 2017
pZxid = 0xe0000000f
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x15d6582c70b0001
dataLength = 184
numChildren = 0
get /brokers/ids/2
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://0.0.0.0:9092"],"jmx_port":-1,"host":"0.0.0.0","timestamp":"1500646657006","port":9092,"version":4}
cZxid = 0xe0000000b
ctime = Fri Jul 21 14:17:37 UTC 2017
mZxid = 0xe0000000b
mtime = Fri Jul 21 14:17:37 UTC 2017
pZxid = 0xe0000000b
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x15d6582c70b0000
dataLength = 184
numChildren = 0
get /brokers/ids/3
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://0.0.0.0:9092"],"jmx_port":-1,"host":"0.0.0.0","timestamp":"1500646656895","port":9092,"version":4}
cZxid = 0xe00000008
ctime = Fri Jul 21 14:17:36 UTC 2017
mZxid = 0xe00000008
mtime = Fri Jul 21 14:17:36 UTC 2017
pZxid = 0xe00000008
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x35d6582c7800000
dataLength = 184
numChildren = 0
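
The uniqueness check that the wiki describes can be sketched in a few lines of Python. This is only a rough illustration: it assumes the JSON payloads were copied by hand from the shell output above, and the helper name find_duplicate_endpoints is made up here, not part of any Kafka or Zookeeper tooling.

```python
import json
from collections import defaultdict

# JSON payloads copied from `get /brokers/ids/<id>` (znode stat lines omitted).
registrations = {
    1: '{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://0.0.0.0:9092"],"jmx_port":-1,"host":"0.0.0.0","timestamp":"1500646657734","port":9092,"version":4}',
    2: '{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://0.0.0.0:9092"],"jmx_port":-1,"host":"0.0.0.0","timestamp":"1500646657006","port":9092,"version":4}',
    3: '{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://0.0.0.0:9092"],"jmx_port":-1,"host":"0.0.0.0","timestamp":"1500646656895","port":9092,"version":4}',
}


def find_duplicate_endpoints(registrations):
    """Return {endpoint: [broker ids]} for endpoints claimed by more than one broker.

    Hypothetical helper for illustration only.
    """
    owners = defaultdict(list)
    for broker_id, payload in registrations.items():
        for endpoint in json.loads(payload)["endpoints"]:
            owners[endpoint].append(broker_id)
    return {ep: sorted(ids) for ep, ids in owners.items() if len(ids) > 1}


duplicates = find_duplicate_endpoints(registrations)
print(duplicates)  # {'PLAINTEXT://0.0.0.0:9092': [1, 2, 3]}
```

All three brokers claim the same endpoint, which is exactly the situation the wiki warns about.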

Has anyone got a better idea?


Solution

  • In Kafka's server.properties, there are two relevant property keys:

    listeners

    The address the socket server listens on. It will get the value returned from java.net.InetAddress.getCanonicalHostName() if not configured.
    FORMAT:
    listeners = listener_name://host_name:port
    EXAMPLE:
    listeners = PLAINTEXT://your.host.name:9092

    advertised.listeners

    Hostname and port the broker will advertise to producers and consumers. If not set, it uses the value for "listeners" if configured. Otherwise, it will use the value returned from java.net.InetAddress.getCanonicalHostName().

    Pay attention to the details of advertised.listeners. If you don't configure this property, it falls back to the value of listeners. When you set listeners to PLAINTEXT://0.0.0.0:9092, the broker listens on all network interfaces of your Kafka server. But if advertised.listeners also ends up as 0.0.0.0, that is the address the broker registers in Zookeeper, and nobody who reads it (consumers, producers, other brokers) can tell where your Kafka server actually is, so they all fail to connect.

    In short, advertised.listeners should be set to an IP address or hostname that other machines can actually use to reach this broker, for example its public IP if clients connect over the Internet. Each broker must advertise its own, unique address.
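
To make the fallback rule concrete, here is a small Python sketch that reads a server.properties text and reports listeners that would be advertised as 0.0.0.0. It is an illustration under assumptions, not how Kafka itself validates its config: the function names are invented, kafka1.example.com is a placeholder hostname, and only the 0.0.0.0 case discussed above is checked.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def listener_host(listener):
    """Extract the host from 'NAME://host:port' (a port is assumed to be present)."""
    _, _, address = listener.partition("://")
    host, _, _port = address.rpartition(":")
    return host


def unreachable_listeners(props):
    """Return effective advertised listeners whose host is the 0.0.0.0 wildcard."""
    # advertised.listeners falls back to listeners when it is not set.
    raw = props.get("advertised.listeners") or props.get("listeners", "")
    effective = [item.strip() for item in raw.split(",") if item.strip()]
    return [item for item in effective if listener_host(item) == "0.0.0.0"]


# Only listeners is set, so 0.0.0.0 is what gets advertised.
BAD_CONFIG = """
broker.id=1
listeners=PLAINTEXT://0.0.0.0:9092
"""

# Bind to all interfaces, but advertise a reachable name (placeholder hostname).
GOOD_CONFIG = """
broker.id=1
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://kafka1.example.com:9092
"""

print(unreachable_listeners(parse_properties(BAD_CONFIG)))   # ['PLAINTEXT://0.0.0.0:9092']
print(unreachable_listeners(parse_properties(GOOD_CONFIG)))  # []
```

The second config is the pattern to aim for: keep listeners on 0.0.0.0 if you want to bind everywhere, and give each broker its own advertised.listeners value.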