I would like to run Hadoop
and Flume
dockerized. I have a standard Hadoop
image with all the default values. I cannot see how can these services communicate each other placed in separated containers.
Flume
's Dockerfile
looks like this:
FROM ubuntu:14.04.4
RUN apt-get update && apt-get install -q -y --no-install-recommends wget
RUN mkdir /opt/java
RUN wget --no-check-certificate --header "Cookie: oraclelicense=accept-securebackup-cookie" -qO- \
https://download.oracle.com/otn-pub/java/jdk/8u20-b26/jre-8u20-linux-x64.tar.gz \
| tar zxvf - -C /opt/java --strip 1
RUN mkdir /opt/flume
RUN wget -qO- http://archive.apache.org/dist/flume/1.6.0/apache-flume-1.6.0-bin.tar.gz \
| tar zxvf - -C /opt/flume --strip 1
ADD flume.conf /var/tmp/flume.conf
ADD start-flume.sh /opt/flume/bin/start-flume
ENV JAVA_HOME /opt/java
ENV PATH /opt/flume/bin:/opt/java/bin:$PATH
CMD [ "start-flume" ]
EXPOSE 10000
You should link your containers. There are some variants how you can implement this.
1) Publish ports:
docker run -p 50070:50070 hadoop
option p
binds port 50070 of your docker container with port 50070 of host machine
2) Link containers (using docker-compose)
docker-compose.yml
version: '2'
services:
hadoop:
image: hadoop:2.6
flume:
image: flume:last
links:
- hadoop
link option here binds your flume container with hadoop
more info about this https://docs.docker.com/engine/userguide/networking/default_network/dockerlinks/