Search code examples
dockerhadooph2ohdp

Can a docker image use hadoop?


Can a docker image access hadoop resources? Eg. submit YARN jobs and access HDFS; something like MapR's Datasci. Refinery, but for Hortonworks HDP 3.1. (May assume that the image will be launched on a hadoop cluster node).

Saw the hadoop docs for launching docker applications from hadoop nodes, but was interested in whether could go the "other way" (ie. being able to start a docker image with the conventional docker -ti ... command and have that application be able to run hadoop jars etc. (assuming that the docker image host is a hadoop node itself)). I understand that MapR hadoop has docker images for doing this, but am interested in using Hortonworks HDP 3.1. Ultimately trying to run h2o hadoop in a docker container.

Anyone know if this is possible or can confirm that this is not possible?


Solution

  • Yes. As long as you have the client jars and appropriate configs similar to an edge node for your container it should work.