I am trying to run Kafka with a mounted NFS volume, but Kafka fails to start with the following exception:
[2020-03-15 09:36:11,580] ERROR There was an error in one of the threads during logs loading: org.apache.kafka.common.KafkaException: Found directory /var/lib/kafka/data/.snapshot, '.snapshot' is not in the form of topic-partition or topic-partition.uniqueId-delete (if marked for deletion).
Kafka's log directories (and children) should only contain Kafka topic data. (kafka.log.LogManager)
[2020-03-15 09:36:11,582] ERROR [KafkaServer id=1] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.KafkaException: Found directory /var/lib/kafka/data/.snapshot, '.snapshot' is not in the form of topic-partition or topic-partition.uniqueId-delete (if marked for deletion).
Kafka's log directories (and children) should only contain Kafka topic data.
    at kafka.log.Log$.exception$1(Log.scala:2150)
    at kafka.log.Log$.parseTopicPartitionName(Log.scala:2157)
    at kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:260)
    at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$11$$anonfun$apply$15$$anonfun$apply$2.apply$mcV$sp(LogManager.scala:345)
    at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:63)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
This is my docker-compose file:
zookeeper:
  image: confluentinc/cp-zookeeper:5.3.2
  environment:
    ZOOKEEPER_CLIENT_PORT: 2181
  volumes:
    - zk-data:/var/lib/zookeeper/data:nocopy
    - zk-log:/var/lib/zookeeper/log:nocopy
kafka:
  image: confluentinc/cp-kafka:5.3.2
  environment:
    KAFKA_ADVERTISED_HOST_NAME: kafka
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
  volumes:
    - kf-data:/var/lib/kafka/data:nocopy
volumes:
  zk-data:
    driver: local
    driver_opts:
      type: "nfs"
      o: addr=18.0.3.227 #IP of NFS
      device: ":/opt/data/zk-data"
  zk-log:
    driver: local
    driver_opts:
      type: "nfs"
      o: addr=18.0.3.227
      device: ":/opt/data/zk-log"
  kf-data:
    driver: local
    driver_opts:
      type: "nfs"
      o: addr=18.0.3.227
      device: ":/opt/data/kf-data"
On my NFS server, the exported directory contains the following:
ls -la /opt/data/kf-data/.snapshot
total 80
drwxrwxrwx 33 root root 12288 Mar 28 00:10 .
drwx------ 2 root domain^users 4096 Feb 21 19:20 ..
drwx------ 2 root domain^users 4096 Feb 13 11:06 daily.2020-02-14_0010
drwx------ 2 root domain^users 4096 Feb 13 11:06 daily.2020-02-15_0010
drwx------ 2 root domain^users 4096 Feb 13 11:06 daily.2020-02-16_0010
drwx------ 2 root domain^users 4096 Feb 13 11:06 daily.2020-02-17_0010
drwx------ 2 root domain^users 4096 Feb 21 19:20 snapmirror.ka938443-8ea1-22e8-6608-00a067d1a20a_2148891236.2020-02-27_180700
There is a hidden folder named .snapshot. It is generated automatically by the NFS server and cannot be removed, which is why Kafka complains: Found directory /var/lib/kafka/data/.snapshot, '.snapshot' is not in the form of topic-partition or topic-partition.uniqueId-delete (if marked for deletion).
This looks like a general Kafka problem. Is there any special configuration or workaround that lets Kafka use an external NFS volume?
Any ideas would be appreciated!
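To illustrate why the directory name is rejected, here is a rough Python sketch of the kind of name check the error message describes. It is not Kafka's actual implementation (per the stack trace, that lives in kafka.log.Log.parseTopicPartitionName); the regular expression and the function names looks_like_partition_dir and unexpected_dirs are assumptions made up for illustration, and the sketch only covers the two forms named in the error message.

```python
import os
import re

# Simplified stand-in for the broker's directory-name check (NOT Kafka's code).
# Accepts "<topic>-<partition>" and "<topic>-<partition>.<uniqueId>-delete".
PARTITION_DIR = re.compile(r"^(?P<topic>.+)-(?P<partition>\d+)(\.[^.]+-delete)?$")


def looks_like_partition_dir(name):
    """Return True if the directory name has one of the two accepted forms."""
    return PARTITION_DIR.match(name) is not None


def unexpected_dirs(log_dir):
    """Return the sorted names of subdirectories of log_dir that fail the check."""
    return sorted(
        entry.name
        for entry in os.scandir(log_dir)
        if entry.is_dir() and not looks_like_partition_dir(entry.name)
    )
```

Calling unexpected_dirs on the mounted data path (here /var/lib/kafka/data) before starting the broker should list .snapshot if the NFS server exposes it, since that name contains no "-<partition>" part.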
If you are using NetApp as the NFS platform, this may help: disabling access to .snapshot in NetApp is a global vFiler setting; it cannot be set per folder or per share.
If you cannot turn off access to .snapshot, there is no solution other than switching to an NFS platform that does not create a .snapshot folder in every directory.