Tags: hadoop, docker, dockerfile, flume

Docker inter-container communication


I would like to run Hadoop and Flume dockerized. I have a standard Hadoop image with all the default values, but I cannot see how these services, placed in separate containers, can communicate with each other.

Flume's Dockerfile looks like this:

FROM ubuntu:14.04.4

RUN apt-get update && apt-get install -q -y --no-install-recommends wget

RUN mkdir /opt/java
RUN wget --no-check-certificate --header "Cookie: oraclelicense=accept-securebackup-cookie" -qO- \
  https://download.oracle.com/otn-pub/java/jdk/8u20-b26/jre-8u20-linux-x64.tar.gz \
  | tar zxvf - -C /opt/java --strip 1

RUN mkdir /opt/flume
RUN wget -qO- http://archive.apache.org/dist/flume/1.6.0/apache-flume-1.6.0-bin.tar.gz \
  | tar zxvf - -C /opt/flume --strip 1

ADD flume.conf /var/tmp/flume.conf
ADD start-flume.sh /opt/flume/bin/start-flume

ENV JAVA_HOME /opt/java
ENV PATH /opt/flume/bin:/opt/java/bin:$PATH

EXPOSE 10000

CMD [ "start-flume" ]

Solution

  • You should link your containers. There are several ways to implement this.

    1) Publish ports:

    docker run -p 50070:50070 hadoop

    The -p option binds port 50070 of your Docker container to port 50070 of the host machine.
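    Note that publishing a port makes the service reachable from the host (and from other machines), not by container name. A quick sketch, assuming a locally built image named hadoop whose NameNode web UI listens on 50070:

    ```shell
    # Run detached, publishing the NameNode web UI port (the image name "hadoop" is an assumption)
    docker run -d --name hadoop -p 50070:50070 hadoop

    # From the host, the UI should now answer on the published port
    curl http://localhost:50070/
    ```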

    2) Link containers (using docker-compose)

    docker-compose.yml

    version: '2'
    services:
      hadoop:
        image: hadoop:2.6
      flume:
        image: flume:last
        links:
          - hadoop

    The links option here connects your flume container to hadoop, so the hadoop container is reachable from inside flume under the hostname hadoop.
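    With the link in place, Flume can address the Hadoop container by hostname. A minimal flume.conf sketch (the agent, source, and sink names are illustrative, and the NameNode port 8020 assumes a default HDFS setup; only the hdfs.path line depends on the link):

    ```
    agent.sources = netcat-src
    agent.channels = mem-ch
    agent.sinks = hdfs-sink

    # Netcat source on the port the Dockerfile exposes
    agent.sources.netcat-src.type = netcat
    agent.sources.netcat-src.bind = 0.0.0.0
    agent.sources.netcat-src.port = 10000
    agent.sources.netcat-src.channels = mem-ch

    agent.channels.mem-ch.type = memory

    # "hadoop" resolves to the linked container's IP inside this container
    agent.sinks.hdfs-sink.type = hdfs
    agent.sinks.hdfs-sink.hdfs.path = hdfs://hadoop:8020/flume/events
    agent.sinks.hdfs-sink.channel = mem-ch
    ```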

    More info about container linking: https://docs.docker.com/engine/userguide/networking/default_network/dockerlinks/