Tags: java, kubernetes, apache-kafka, apache-zookeeper, strimzi

Why doesn't Kafka start when deployed on local k8s?


I have a Windows machine with Docker installed and a k8s instance enabled from Docker. To create a Kafka instance in k8s, I chose the approach linked here.

To deploy Kafka I used these commands:

kubectl create namespace kafka
kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
kubectl apply -f https://strimzi.io/examples/latest/kafka/kafka-persistent-single.yaml -n kafka

Everything launched successfully, but after I restarted my laptop the Kafka pod started with an error (screenshot from Lens).

When I opened the Kafka pod logs, I saw a ZooKeeper connection error. When I opened the ZooKeeper pod logs, I saw an error like this:

2023-12-09 18:06:49,991 INFO Created server with tickTime 2000 ms minSessionTimeout 4000 ms maxSessionTimeout 40000 ms clientPortListenBacklog -1 datadir /var/lib/zookeeper/data/version-2 snapdir /var/lib/zookeeper/data/version-2 (org.apache.zookeeper.server.ZooKeeperServer) [QuorumPeer[myid=1](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
2023-12-09 18:06:49,991 ERROR Couldn't bind to my-cluster-zookeeper-0.my-cluster-zookeeper-nodes.kafka.svc/<unresolved>:2888 (org.apache.zookeeper.server.quorum.Leader) [QuorumPeer[myid=1](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
java.net.SocketException: Unresolved address
    at java.base/java.net.ServerSocket.bind(ServerSocket.java:380)
    at java.base/java.net.ServerSocket.bind(ServerSocket.java:342)
    at org.apache.zookeeper.server.quorum.Leader.createServerSocket(Leader.java:322)
    at org.apache.zookeeper.server.quorum.Leader.lambda$new$0(Leader.java:301)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
    at java.base/java.util.concurrent.ConcurrentHashMap$KeySpliterator.forEachRemaining(ConcurrentHashMap.java:3573)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
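
The "Unresolved address" exception means the JVM could not resolve the pod's own hostname through DNS before binding. As a minimal sketch (the function name `unresolved_hosts` is hypothetical, not part of ZooKeeper or Strimzi), the following helper pulls the affected hostname and port out of such log lines so the name can be tested separately:

```shell
#!/bin/sh
# unresolved_hosts: read ZooKeeper log lines on stdin and print
# "<hostname> <port>" once for every bind failure on an unresolved address.
# Assumes the log format shown above: "... bind to HOST/<unresolved>:PORT ...".
unresolved_hosts() {
    sed -n 's|.*bind to \([^/ ]*\)/<unresolved>:\([0-9]*\).*|\1 \2|p' | sort -u
}
```

It could be fed with something like `kubectl logs my-cluster-zookeeper-0 -n kafka | unresolved_hosts`. The printed hostname can then be looked up from a throwaway pod, for example with `kubectl run dns-test --rm -it --restart=Never --image=busybox -n kafka -- nslookup <hostname>` (the pod name `dns-test` and the busybox image are assumptions chosen for illustration).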

I tried resetting k8s and Docker to factory defaults and changing Docker's resources (increasing memory), but the error is the same.

Updates:

List of permissions: [screenshot]

DNS logs: [screenshot]

This means the coredns-5dd5756b68-qhp5q pod can't connect to 192.168.65.7:53.

After restarting the k8s node I saw these errors in the same DNS logs:

[ERROR] plugin/errors: 2 5593748469660065637.885187837306804871. HINFO: read udp 10.1.0.27:42685->192.168.65.7:53: i/o timeout
[ERROR] plugin/errors: 2 5593748469660065637.885187837306804871. HINFO: read udp 10.1.0.27:44025->192.168.65.7:53: i/o timeout
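
To check whether every timeout points at the same upstream resolver, the log lines can be grouped by their destination address. This is only a sketch under the assumption that the errors keep the `read udp SRC->DST: i/o timeout` shape shown above; `summarize_dns_timeouts` is a hypothetical helper name:

```shell
#!/bin/sh
# summarize_dns_timeouts: read CoreDNS log lines on stdin and print
# "<count> <upstream ip:port>" for each upstream that timed out.
summarize_dns_timeouts() {
    grep 'i/o timeout' \
        | sed -n 's/.*->\([0-9.]*:[0-9]*\): i\/o timeout.*/\1/p' \
        | sort \
        | uniq -c \
        | awk '{ print $1, $2 }'
}
```

A possible invocation is `kubectl logs -n kube-system -l k8s-app=kube-dns --tail=200 | summarize_dns_timeouts`, assuming CoreDNS runs in `kube-system` with the usual `k8s-app=kube-dns` label. If a single upstream accounts for all the timeouts, the problem is probably on the path from CoreDNS to that resolver rather than inside the Kafka namespace.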

Solution

  • My workaround is to restart the node after the PC starts. I use a .bat file like this:

    @echo OFF
    echo start docker and k8s..
    timeout 20
    
    echo stop node k8s..
    kubectl cordon docker-desktop
    kubectl delete pod my-cluster-kafka-0 -n kafka
    kubectl drain docker-desktop --delete-emptydir-data --ignore-daemonsets --force
    timeout 20
    
    kubectl uncordon docker-desktop
    echo start k8s node..
    echo pod status
    kubectl get pods -n kafka
    timeout 60
    
    echo pod status
    kubectl get pods -n kafka
    timeout 60
    

    I then configured it through gpedit.msc so that it runs when I start working on the PC.
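
The fixed `timeout` pauses in the script above could be replaced with polling, so the script continues as soon as the cluster is ready. The following is a rough sketch, not the original workaround: it is written for a POSIX shell (for example Git Bash or WSL on Windows, which is an assumption about the environment), and `retry` is a hypothetical helper name.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs are used up.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $n/$attempts failed: $*" >&2
        # Sleep only between attempts, not after the last one.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}
```

After `kubectl uncordon docker-desktop`, it might be used as `retry 10 20 kubectl wait --for=condition=Ready pod --all -n kafka --timeout=30s`, which keeps checking until all pods in the `kafka` namespace report Ready or the attempts run out.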