java hadoop mapreduce hdfs

I am practicing basic Hadoop with IntelliJ


https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html

I am trying to modify the code from the tutorial linked above. My goal is to change its last three lines, which designate the input and output directories, so that they point to HDFS instead of the local file system.

I am trying to access my Hadoop data path from the WordCount Java code.

I already have a Hadoop (HDFS) installation with data in it.

For example, I am trying to reach the /user/prac/input directory from Java code.

The original code, which accesses local directories, is this:

FileInputFormat.addInputPath(job, new Path("input")); 
FileOutputFormat.setOutputPath(job, new Path("output"));
System.exit(job.waitForCompletion(true) ? 0 : 1);

This is how I modified it:

String directoryName = "hdfs://localhost:9000/user/prac/";

FileInputFormat.addInputPath(job, new Path(directoryName + "input")); 
FileOutputFormat.setOutputPath(job, new Path(directoryName + "output"));
System.exit(job.waitForCompletion(true) ? 0 : 1);

Am I doing this right?


Solution

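  • The Hadoop client libraries read the HDFS address automatically from the fs.defaultFS property in $HADOOP_CONF_DIR/core-site.xml, so you do not need to set it explicitly (provided that file is on the classpath when the job runs, for example when launching from IntelliJ; without it, Hadoop falls back to the local file system).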

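    Also, unless you use an absolute path, every path is resolved against your HDFS home directory, /user/<username>. The username defaults to your OS user and, under simple authentication, can be overridden with the HADOOP_USER_NAME variable.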

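    Therefore, as long as the job runs as the user prac, the original code already reads /user/prac/input and writes /user/prac/output on HDFS, which is exactly what your modified version spells out.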

    Worth mentioning: hardly anyone writes MapReduce code anymore. Spark or Flink jobs will also run from IntelliJ.
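
To make the path resolution in this answer concrete, here is a small plain-Java sketch of the rule. It is only a simplified model: `PathResolutionSketch` and `qualify` are hypothetical names for illustration and are not part of the Hadoop API, and `java.net.URI` stands in for Hadoop's own `Path` handling. The default file system `hdfs://localhost:9000` and the user `prac` are taken from the question.

```java
import java.net.URI;

public class PathResolutionSketch {

    // Simplified model (an assumption, not Hadoop's implementation) of how a
    // path string ends up as a full HDFS location:
    //  - a path with its own scheme is used as given
    //  - an absolute path is placed on the default file system
    //  - a relative path is placed under the user's home directory /user/<user>
    public static URI qualify(String path, URI defaultFs, String user) {
        URI given = URI.create(path);
        if (given.getScheme() != null) {
            return given; // already fully qualified, e.g. hdfs://host:port/...
        }
        String absolute = path.startsWith("/") ? path : "/user/" + user + "/" + path;
        return defaultFs.resolve(absolute);
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://localhost:9000"); // value of fs.defaultFS in the question's setup
        String user = "prac";                                // user running the job

        // Relative path, as in the original tutorial code.
        System.out.println(qualify("input", defaultFs, user));
        // prints hdfs://localhost:9000/user/prac/input

        // Fully qualified path, as in the modified code.
        System.out.println(qualify("hdfs://localhost:9000/user/prac/input", defaultFs, user));
        // prints hdfs://localhost:9000/user/prac/input

        // Absolute path without a scheme.
        System.out.println(qualify("/data/input", defaultFs, user));
        // prints hdfs://localhost:9000/data/input
    }
}
```

Under these assumptions the first two calls give the same location, which is why new Path("input") and new Path("hdfs://localhost:9000/user/prac/input") point to the same directory when the job runs as prac with core-site.xml on the classpath.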