I have two running containers, one for flume and one for hadoop. Let them be hadoop2 and flume2. I created these two containers from two images, namely hadoop_alone and flume_alone.
```
docker run -d -p 10.236.173.XX:8020:8020 -p 10.236.173.XX:50030:50030 -p 10.236.173.XX:50060:50060 -p 10.236.173.XX:50070:50070 -p 10.236.173.XX:50075:50075 -p 10.236.173.XX:50090:50090 -p 10.236.173.XX:50105:50105 --name hadoopservices hadoop_alone
```
I got into the hadoop container and checked the exposed ports; all of them are exposed properly.
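For anyone reproducing this, one way to double-check the mappings (my own sketch, not part of the original setup; it assumes `netstat` is installed in the image) is:

```
# On the host: list the port mappings Docker created for the container
docker port hadoopservices

# Inside the container: confirm the namenode is actually listening on 8020
docker exec -it hadoopservices netstat -tlnp | grep 8020
```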
Then I created the flume container:

```
docker run -d --name flumeservices -p 0.0.0.0:5140:5140 -p 0.0.0.0:44444:44444 --link hadoopservices:hadoopservices flume_alone
```
I got into the flume container and checked the `env` output and the `/etc/hosts` entries. There is an entry for hadoopservices, and the env variables are created automatically.
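For reference, with the legacy `--link` flag Docker injects a hosts entry and environment variables roughly like the following (addresses here are illustrative, not taken from my actual containers; only port 8020 is shown):

```
# /etc/hosts inside the flume container
172.17.1.XX    hadoopservices

# env variables created by --link for an exposed port
HADOOPSERVICES_PORT_8020_TCP=tcp://172.17.1.XX:8020
HADOOPSERVICES_PORT_8020_TCP_ADDR=172.17.1.XX
HADOOPSERVICES_PORT_8020_TCP_PORT=8020
HADOOPSERVICES_PORT_8020_TCP_PROTO=tcp
```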
My core-site.xml:

```
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://0.0.0.0:8020</value>
</property>
```
I modified it so that it will accept connections on port 8020 from all containers.
My source and sink in flume.conf:

```
a2.sources.r1.type = netcat
a2.sources.r1.bind = localhost
a2.sources.r1.port = 5140

a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.fileType = DataStream
a2.sinks.k1.hdfs.writeFormat = Text
a2.sinks.k1.hdfs.path = hdfs://hadoopservices:8020/user/root/syslog/%y-%m-%d/%H%M/%S
a2.sinks.k1.hdfs.filePrefix = events
a2.sinks.k1.hdfs.roundUnit = minute
a2.sinks.k1.hdfs.useLocalTimeStamp = true
```
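(The snippet above only shows the source and sink; a runnable agent also needs channel wiring. A minimal sketch, assuming a memory channel named c1:)

```
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
```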
I restarted the hadoop namenode after changing core-site.xml.
I try to write into HDFS from flume using:

```
/usr/bin/flume-ng agent --conf-file /etc/flume-ng/conf/flume.conf --name a2 -Dflume.root.logger=INFO,console
```
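Once the agent is up, events can be pushed at the netcat source; for example (a sketch of my test, run from inside the flume container since the source binds to localhost):

```
# send a test line to the netcat source on port 5140
echo "hello flume" | nc localhost 5140
```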
It says:

```
INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.ConnectException: Connection refused
```
So I figured the problem is with the connection established between these two containers. I got into the hadoop container and checked the port connections:
```
netstat -tna

tcp        0      0 127.0.0.1:52521        127.0.0.1:8020         TIME_WAIT
tcp        0      0 127.0.0.1:8020         127.0.0.1:52516        ESTABLISHED
tcp        0      0 127.0.0.1:52516        127.0.0.1:8020         ESTABLISHED
```
But I expect it to be:
```
tcp        0      0 172.17.1.XX:54342      172.17.1.XX:8020       TIME_WAIT
tcp        0      0 172.17.1.XX:54332      172.17.1.XX:8020       ESTABLISHED
tcp        0      0 172.17.1.XX:8020       172.17.1.XX:54332      ESTABLISHED
```
where 172.17.1.XX is the IP of my hadoop container.
I think I have found the cause. Is this the reason? Which configuration should be modified? Or should my run statement change? What should be changed to establish a connection between these two Docker containers so that I can write into HDFS from flume?

If you need more info, I'll edit the question further. Please share some ideas.
If anybody faces the same problem, please follow these steps.
1) Check whether `0.0.0.0:8020` is updated in core-site.xml.
2) If you update it inside a running container, **I suggest you restart ALL the services, NOT ONLY the namenode**. [Better to do this as part of the Dockerfile.]
3) Check the `env` and `/etc/hosts` contents in the flume container.
4) The hostname in `/etc/hosts` must match the `hdfs path` parameter in flume.conf.
5) Get into the hadoop container, run `netstat -tna`, and you must see connections established to `<hadoop_container_ip>:8020`, not to localhost [127.0.0.1]. (A combined sketch follows this list.)
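A minimal sketch combining steps 1, 2 and 5 (the stop/start script names assume a standard Hadoop sbin layout and `$HADOOP_HOME` being set; adjust for your image):

```
# 1) core-site.xml should contain:
#    <name>fs.defaultFS</name>
#    <value>hdfs://0.0.0.0:8020</value>

# 2) restart ALL the services, not only the namenode
$HADOOP_HOME/sbin/stop-all.sh && $HADOOP_HOME/sbin/start-all.sh

# 5) verify inside the hadoop container: connections must go to the
#    container IP on port 8020, not to 127.0.0.1
netstat -tna | grep 8020
```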
I hope this will be helpful to people who are trying to link containers and map ports.