Kafka fails to start with the error below:
Fatal error during KafkaServer startup. Prepare to shutdown
java.lang.IllegalArgumentException: Duplicate log directories found: /node5/kafka/data/logs-47, /node7/kafka/data/logs-47!
at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:155)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:56)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Kafka 0.9.0.1 is deployed as part of Cloudera.
What does the issue mean?
Is there a workaround or solution to this problem? I couldn't find one.
I hit this error after restarting the broker, following an under-replicated partitions issue in a topic partition.
The broker's log contained the errors below before the restart:
java.lang.IllegalStateException: Compaction for partition [logs,47] cannot be aborted and paused since it is in LogCleaningPaused state.
at kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply$mcV$sp(LogCleanerManager.scala:149)
at kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply(LogCleanerManager.scala:140)
at kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply(LogCleanerManager.scala:140)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:231)
...
Cloudera Data Directories configuration:
/node/kafka/data
/node2/kafka/data
...
/node8/kafka/data
UPDATE
I've inspected the contents of the duplicate directories and found that the newest directory seems to have empty log segments:
ls -l /node5/kafka/data/logs-47
-rw-r--r-- 1 kafka kafka 10485760 Mar 9 05:35 00000000000000000000.index
-rw-r--r-- 1 kafka kafka 0 Mar 8 13:12 00000000000000000000.log
While the older directory is not:
ls -l /node7/kafka/data/logs-47
-rw-r--r-- 1 kafka kafka 10485760 Mar 9 05:35 00000000000000366115.index
-rw-r--r-- 1 kafka kafka 0 Nov 25 10:13 00000000000000366115.log
Your error is saying that more than one data directory on a single broker contains data for partition 47 of a topic named logs.
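As the stack trace shows, the exception is thrown from LogManager while it loads the logs, so the broker cannot start (and the LogCleaner cannot continue to delete that data) until only one directory is left for that partition and it is the correct one.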
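To check whether logs-47 is the only partition affected, something like the following rough sketch could be used. The function name is made up for illustration, and it assumes every immediate subdirectory of a data directory is a partition directory; it prints each directory name that appears under more than one of the data directories you pass in.

```shell
# Hypothetical helper: print partition directory names that occur under
# more than one of the given data directories (the broker's log.dirs entries).
find_duplicate_partition_dirs() {
  for data_dir in "$@"; do
    for partition_dir in "$data_dir"/*/; do
      # Skip the unexpanded glob when a data directory is empty or missing.
      [ -d "$partition_dir" ] || continue
      basename "$partition_dir"
    done
  done | sort | uniq -d
}
```

With the directories from the question it would be called as find_duplicate_partition_dirs /node5/kafka/data /node7/kafka/data (plus the remaining data directories), and should list logs-47 along with any other partition that exists twice.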
If you have some idea of what should be in the topic, you can dump the log segments and inspect the messages.
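Dumping a segment can be done with the kafka.tools.DumpLogSegments class that ships with Kafka. The wrapper below is only a sketch: the function name and the KAFKA_RUN_CLASS variable are my own, and the default path bin/kafka-run-class.sh is an assumption (a Cloudera install may expose the launcher under a different name or location).

```shell
# Hypothetical wrapper around Kafka's DumpLogSegments tool.
# KAFKA_RUN_CLASS is an assumed variable naming the kafka-run-class launcher.
dump_segment() {
  # $1: path to a .log segment file; --print-data-log also prints message payloads.
  "${KAFKA_RUN_CLASS:-bin/kafka-run-class.sh}" kafka.tools.DumpLogSegments \
    --files "$1" --print-data-log
}
```

For example, dump_segment /node7/kafka/data/logs-47/00000000000000366115.log would print the offsets and payloads stored in that segment, if there are any.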
If you don't know what data should be there, and the partitions are replicated to other brokers that are healthy, then delete all of the faulty partition directories and restart the broker, letting replication restore the missing data.
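If deleting outright feels too risky, a more cautious variant is to move the faulty directories aside while the broker is stopped and remove them only once replication has caught up. This is a swap for the plain delete above, not something Kafka requires. A minimal sketch, with a made-up function name and a backup location of your choosing:

```shell
# Hypothetical helper: move partition directories into a backup directory
# instead of deleting them. Run only while the broker is stopped.
quarantine_partition_dirs() {
  backup_root=$1
  shift
  mkdir -p "$backup_root"
  for dir in "$@"; do
    # Encode the original path in the backup name so that copies of the
    # same partition from different disks do not collide.
    target="$backup_root/$(printf '%s' "$dir" | tr '/' '_')"
    mv -- "$dir" "$target"
  done
}
```

For example, quarantine_partition_dirs /some/backup/dir /node5/kafka/data/logs-47 /node7/kafka/data/logs-47, where /some/backup/dir is a placeholder that should sit outside the Kafka data directories.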