Search code examples
javawindowshadoopmapreduceword-count

Output of hadoop wordcount mapreduce example is empty on windows (hadoop run locally)


Hi this is my first time ask a question on stackoverflow and my English is bad. I have google many times but still not find out the solution for my problem.

My problem is after running the mapreduce named wordcount by eclipse on windows. I have a folder: output\wordcount but nothing in this folder (image behind): Image of the output folder

This is my WordCount.java

package de.tud.stg;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.log4j.BasicConfigurator;

import java.io.IOException;
import java.util.StringTokenizer;

public class WordCount {

    public static class WordMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer tokenizer = new StringTokenizer(value.toString());
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {

        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "WordCount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(WordMapper.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true)? 0 : 1);
    }
}

This is my pom.xml file

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>de.tud.stg</groupId>
    <artifactId>hadoop</artifactId>
    <version>1.0-SNAPSHOT</version>
    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.7.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-common</artifactId>
            <version>2.7.2</version>
        </dependency>
    </dependencies>
</project>

*Update

Before that I have an error. I solved the problem by solution in link: firt problem first problem solution

I also met another problem and solved by this: second problem second problem solution

Then I have this problem. I think the problem come from: job.waitForCompletion(true) return false value

I really appreciate your help

*Update my input files:

file01

file02

*Update my input, output arguments:

enter image description here


Solution

  • After trying so many times

    The problem is your current user name having a space

    You can check by go to the path C:\Users

    So the solution is:

    1. Delete the output folder (if exist).
    2. Move the Eclipse's file location out of the C:\Program Files.
    3. Create a new user (not having space in the user name): Start > Settings > Accounts > Family & other users > Add account.
    4. Change the new account type to administrator.
    5. Restart and Login with your new account.
    6. Retry running the MapReduce program.
    7. The problem will disappear.

    Good luck...