I have an issue with my Drill cluster, which has 1 master node and 4 nodes.
I stopped the drillbit on each of the Drill nodes:
sudo -i /home/hadoop/apache-drill-1.10.0/bin/drillbit.sh stop
and then started it again:
sudo -i /home/hadoop/apache-drill-1.10.0/bin/drillbit.sh start
I have no clue what to do. I have tried searching online, but nothing seems related besides this link, which didn't solve my issue. Here is what the drillbit log shows:
2017-07-04 13:09:10,454 [main] INFO o.apache.drill.exec.server.Drillbit - Construction completed (2928 ms).
2017-07-04 13:09:10,864 [main] INFO o.a.drill.exec.rpc.user.UserServer - User Error Occurred: Drillbit could not bind to port 31010. (Address already in use)
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Drillbit could not bind to port 31010.
Server type UserServer
[Error Id: a75dd2ec-a3b6-4fcb-b60b-0a1a63354943 ]
................
................
org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) ~[drill-common-1.10.0.jar:1.10.0]
at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:159) [drill-java-exec-1.10.0.jar:1.10.0]
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:294) [drill-java-exec-1.10.0.jar:1.10.0]
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:271) [drill-java-exec-1.10.0.jar:1.10.0]
at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:267) [drill-java-exec-1.10.0.jar:1.10.0]
2017-07-04 13:09:11,919 [main] INFO o.apache.drill.exec.server.Drillbit - Shutdown completed (1053 ms).
I would love some input on how to solve it.
How can I find the reason for this: Drillbit could not bind to port 31010. (Address already in use)
It turned out that a leftover java process on my EC2 instance was still holding port 31010, so the new drillbit could not bind to it.
I found the PID of that process with:
netstat -tulp | grep LIST | grep 31010
which returned:
tcp 0 0 *:31010 *:* LISTEN 16449/java
and then, as the finisher, killed that PID:
kill -9 16449
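
As a gentler variant of the kill above, here is a rough sketch that pulls the PID out of a netstat line like the one shown and tries a plain kill (SIGTERM) first, falling back to kill -9 only if the process is still alive. The function names pid_from_netstat_line and stop_pid are made up for illustration, and the 5-second wait is an arbitrary assumption, not anything Drill prescribes.

```shell
#!/bin/sh
# Hypothetical helper: extract the PID from a netstat line whose last
# column has the form "PID/program", e.g. "16449/java".
pid_from_netstat_line() {
    printf '%s\n' "$1" | awk '{ split($NF, a, "/"); print a[1] }'
}

# Hypothetical helper: send SIGTERM first and use SIGKILL only if the
# process is still there after about 5 seconds (arbitrary timeout).
stop_pid() {
    pid=$1
    # If the signal cannot be delivered, the process is most likely gone.
    kill "$pid" 2>/dev/null || return 0
    for i in 1 2 3 4 5; do
        kill -0 "$pid" 2>/dev/null || return 0   # no longer running
        sleep 1
    done
    kill -9 "$pid" 2>/dev/null
}
```

With these defined, something like stop_pid "$(pid_from_netstat_line "$(netstat -tulp | grep LIST | grep 31010)")" would do the lookup and the kill in one go, assuming exactly one process is listening on that port.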