I am running the WordCount example in Eclipse Luna 3.8. The job runs fine with the LocalJobRunner, but I want it to run on a YARN cluster because I want to access the Hadoop logs. Somewhere I read that when a job runs locally it does not create logs until it is submitted to the ResourceManager, and submitting a job to the ResourceManager is only possible when the job runs on YARN.
My working environment:
hadoop-2.6.0 running in pseudo-distributed mode.
Eclipse Luna 3.8.
Any help will be appreciated.
You need yarn-site.xml and core-site.xml on your classpath, as well as all the YARN and MapReduce jars (dependencies). You may already have those jars from Maven or similar, but you are most likely missing the config files. You can set these on the classpath from the "Run Configurations" dialog in Eclipse. I assume you have a local Hadoop installation with these configuration files and can run hadoop commands; in that case, point your classpath at that installation's conf and lib directories. It may be tedious, but start by just pointing at the conf dir (which contains core-site.xml and yarn-site.xml) and see if that works. If not, then also exclude Eclipse's local YARN and MapReduce dependencies (Maven or similar) and set them explicitly from your installation dir. Check this article for setting the classpath for Hadoop 1: https://letsdobigdata.wordpress.com/2013/12/07/running-hadoop-mapreduce-application-from-eclipse-kepler/
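For reference, here is a minimal set of properties that routes job submission through YARN instead of the LocalJobRunner. These normally live in core-site.xml, mapred-site.xml, and yarn-site.xml in the conf dir you put on the classpath. The hostnames and the port are assumptions for a typical single-node pseudo-distributed setup; adjust them to your environment:

```xml
<!-- core-site.xml: where HDFS lives (port 9000 is a common choice; adjust as needed) -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- mapred-site.xml: submit jobs via YARN instead of the LocalJobRunner -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

<!-- yarn-site.xml: where the ResourceManager runs -->
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
  </property>
</configuration>
```

If mapreduce.framework.name is missing or set to "local", Hadoop falls back to the LocalJobRunner, which is exactly the behavior you are seeing.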
Here's another article from MapR (ignore the MapR-client-related setup): https://mapr.com/blog/basic-notes-on-configuring-eclipse-as-a-hadoop-development-environment-for-mapr/
You can follow similar steps for Hadoop 2 (YARN); the basic idea is that your application's runtime has to pick up the correct jars and config files on the classpath in order to successfully deploy the job to the cluster.
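If you'd rather not rely on Eclipse's classpath at all, you can also set the same properties programmatically in your driver before submitting. Below is a sketch of a WordCount driver doing that. The host/port values and the WordCountDriver class name are assumptions for a pseudo-distributed setup; the mapper/reducer wiring is elided since it is the same as in the stock WordCount example:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Equivalent of the *-site.xml settings; only needed if the config
        // files are not on the classpath. Values assume a local
        // pseudo-distributed setup -- adjust to your environment.
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.hostname", "localhost");

        Job job = Job.getInstance(conf, "word count");
        // setJarByClass only works when running from a jar; when launching
        // from Eclipse you typically also need to point at a built jar so
        // YARN can ship your classes to the cluster, e.g.:
        // job.setJar("/path/to/your/wordcount.jar");
        job.setJarByClass(WordCountDriver.class);

        // ... set mapper, reducer, and output key/value classes here,
        //     exactly as in the standard WordCount example ...

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Once the job actually goes through the ResourceManager, you should see it in the YARN web UI (port 8088 by default) and its container logs under the NodeManager's log directory.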