Basic setup: I have three VMs: 192.168.23.165, 192.168.23.166 and 192.168.23.172. ZooKeeper runs in standalone mode on the 165 machine, and Storm runs on all three machines. The firewalls on all three machines are disabled. The ZooKeeper and Storm versions are 3.4.14 and 1.2.3 respectively.
My steps: First, I started ZooKeeper on the 165 machine. Second, I started the Storm nimbus on the 165 machine and started Storm supervisors on the 166 and 172 machines. Third, I submitted the Storm topology from the 165 machine.
Question 1: The topology is submitted successfully, but no worker processes are created on the 166 and 172 machines when I check with the jps -l command. The supervisor.log on the 166 machine shows the same errors as the one on the 172 machine.
Question 2: When I run jps -l a few more times on the machines running a supervisor, the supervisor process has stopped for no apparent reason.
supervisor.log
2019-09-29 16:55:41.076 o.a.s.u.NimbusClient Async Localizer [WARN] Ignoring exception while trying to get leader nimbus info from 192.168.23.165. will retry with a different seed host.
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 (Connection refused)
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:112) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.ThriftClient.<init>(ThriftClient.java:73) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.<init>(NimbusClient.java:136) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:103) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:66) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:540) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) [storm-core-1.2.3.jar:1.2.3]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
Caused by: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 (Connection refused)
at org.apache.storm.security.auth.TBackoffConnect.retryNext(TBackoffConnect.java:64) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:56) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:104) ~[storm-core-1.2.3.jar:1.2.3]
... 14 more
Caused by: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 (Connection refused)
at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:226) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:82) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:104) ~[storm-core-1.2.3.jar:1.2.3]
... 14 more
Caused by: java.net.ConnectException: 拒绝连接 (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_111]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_111]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_111]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_111]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_111]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_111]
at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:221) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:82) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:104) ~[storm-core-1.2.3.jar:1.2.3]
... 14 more
2019-09-29 16:55:41.083 o.a.s.l.AsyncLocalizer Async Localizer [WARN] Failed to download basic resources for topology-id RandomStringTopologyLocal-1-1569747324
2019-09-29 16:55:41.083 o.a.s.d.s.AdvancedFSOps Async Localizer [INFO] Deleting path /opt/storm/data/supervisor/tmp/37b4a240-736b-40e8-a3a7-e3933fc2105c
2019-09-29 16:55:41.085 o.a.s.d.s.AdvancedFSOps Async Localizer [INFO] Deleting path /opt/storm/data/supervisor/stormdist/RandomStringTopologyLocal-1-1569747324
2019-09-29 16:55:41.086 o.a.s.l.AsyncLocalizer Async Localizer [WARN] Caught Exception While Downloading (rethrowing)...
org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [192.168.23.165]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:120) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:66) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:540) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) [storm-core-1.2.3.jar:1.2.3]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
Notably, when I run the supervisor on the 165 machine, so that the supervisor, nimbus and ZooKeeper all run on the same machine, and submit the topology again, the worker processes are created and everything works.
The ZooKeeper configuration is as follows:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/zookeeper/data
logDir=/opt/zookeeper/log
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
The Storm configuration is as follows:
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
- "192.168.23.165"
# - "server1"
# - "server2"
#
nimbus.seeds: ["192.168.23.165"]
#
storm.local.dir: "/opt/storm/data"
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
This may be asking the obvious, but did you check that your 166/172 machines can connect to 192.168.23.165 on port 6627 (the default nimbus.thrift.port)?
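One way to run that check from the supervisor machines is a plain TCP connection attempt. The sketch below is only an illustration, not part of Storm: the helper name can_connect and the 3-second timeout are my own choices, and it assumes Python 3 is available on the 166 and 172 machines. It tests only whether something is accepting connections on the port, not whether nimbus itself is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only checks that the port
    accepts connections, which is what "Connection refused" is about.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

On 166 and 172, calling can_connect("192.168.23.165", 6627) checks the nimbus Thrift port, and can_connect("192.168.23.165", 2181) checks the ZooKeeper clientPort from the configuration above. If the first returns False while the second returns True, it may be worth looking at which address nimbus is actually listening on on the 165 machine.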