I am using a Hortonworks cluster (2 nodes) to run Spark and Flume. When I run the job with --master "local[*]", Flume is able to send the events and Spark is able to receive them; at localhost:4040 I can see the events arriving from Flume. (We are pumping 100 events/sec from Flume using the flume-ng-sql source, at roughly 1 KB each.)
However, when I run the same example with --master "yarn-client", I get the error below in Flume, and Spark does not receive any events.
2015-08-13 18:24:24,927 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Failed to send events
at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:403)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flume.FlumeException: NettyAvroRpcClient { host: localhost, port: 55555 }: RPC connection error
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:182)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:121)
at org.apache.flume.api.NettyAvroRpcClient.configure(NettyAvroRpcClient.java:638)
at org.apache.flume.api.RpcClientFactory.getInstance(RpcClientFactory.java:88)
at org.apache.flume.sink.AvroSink.initializeRpcClient(AvroSink.java:127)
at org.apache.flume.sink.AbstractRpcSink.createConnection(AbstractRpcSink.java:222)
at org.apache.flume.sink.AbstractRpcSink.verifyConnection(AbstractRpcSink.java:283)
at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:360)
... 3 more
Caused by: java.io.IOException: Error connecting to localhost/
at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:261)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:203)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:152)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:168)
... 10 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:496)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:452)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:365)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
... 1 more
I have also observed the following on the cluster:
-- Memory consumption under YARN is much higher than in local mode.
-- When I pump only 100 events per 30 seconds, Flume and Spark are able to connect and process the events with yarn-client as well as with local.
Below are the commands I am using for Flume and Spark.
sudo -u hdfs flume-ng agent --conf conf/ -f conf/flume_mysql_spark.conf -n agent1 -Dflume.root.logger=INFO,console > flumelog.txt
sudo -u hdfs spark-submit --master "yarn-client" --class "org.paladion.atm.FlumeEventCount" target/atm-1.1-jar-with-dependencies.jar > sparklog.txt
sudo -u hdfs spark-submit --master "local[*]" --class "org.paladion.atm.FlumeEventCount" target/atm-1.1-jar-with-dependencies.jar > sparklog.txt
Kindly let me know what could be wrong here.
It got solved as follows:
1 - If running in local mode, give the IP of the local machine in Flume as well as in Spark.
2 - If running on the cluster (yarn-client or yarn-cluster), give the IP of the cluster machine to which you want to send the events, in Flume as well as in Spark. This should be a machine other than the one where you launch the program, so perhaps the IP of a node that is not the master node.
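The "Connection refused" in the trace means nothing was listening on localhost:55555 on the machine where the Flume agent runs. That is consistent with the Spark receiver having been started on a different node under YARN, though I have not confirmed that this is the only cause. A quick way to check which host actually has the port open is to probe it from the Flume machine. Below is a minimal sketch in Python; the function name `can_connect` is my own and not part of Flume or Spark, and the host names you pass in are whatever your cluster uses.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Running `can_connect("localhost", 55555)` on the Flume machine while the Spark job is up, and then the same call with the IP of each cluster node, should show where the receiver is listening. The host for which it returns True is the one to put in the Flume sink configuration.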
Let me know if I am wrong, if this worked for some other reason, or if there is a better solution.