ArrayIndexOutOfBoundsException in MapReduce

I am getting an ArrayIndexOutOfBoundsException in the map part of my job. My code is below. I am trying to read the input file from HDFS. Is there a better way to read an HDFS file?

public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {
                private Text key12 = new Text();
                private Text value = new Text();

                public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
                        String line=value.toString();
                        while((line = value.toString()) != null) {
                                        //StringTokenizer tokenizer = new StringTokenizer(line);
                                        //String field = tokenizer.nextToken();
                                        String[] parts= line.split(" ");

                                        if(parts[0].contains("STN") == false) {
                                                String field=parts[0];
                                                String month=parts[3];
                                                String temp;
                                                //String month = tokenizer.nextToken();

                                                //String temp = tokenizer.nextToken();

                                                String val = month+temp;

                                                output.collect(key12, value);
                                        }
                        }
                }
}


  • There are an awful lot of places where this could go wrong, irrespective of where this particular error is. What if parts has fewer elements than the code indexes (reading parts[3] needs at least four)? What if it has enough elements but some of them are empty strings? What if line doesn't have a space character in it? What if month only has three characters in it?

    Handle all of these situations and your issue will be resolved.

    As an aside, use

     if(!parts[0].contains("STN"))

    instead of

     if(parts[0].contains("STN") == false)

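    and consider extracting some of your Strings (such as "STN" and " ") into private static final String variables. This mainly improves readability and maintainability; it will not noticeably change performance, because Java already interns string literals.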
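
To make the checks above concrete, here is a minimal sketch of defensive line parsing in plain Java, with no Hadoop dependency so it can be tried on its own. The class name StationLineParser, the constant names, and the choice to split on runs of whitespace are my assumptions, not part of the original code; the column positions (0 and 3) and the "STN" header marker are taken from the question.

```java
import java.util.Optional;

// Hypothetical helper: validates one input line before any index is read.
final class StationLineParser {
    // Column positions and header marker mirror the question's code.
    private static final String HEADER_MARKER = "STN";
    private static final int STATION_INDEX = 0;
    private static final int MONTH_INDEX = 3;

    private StationLineParser() {}

    /**
     * Returns {station, month} for a data line, or Optional.empty() for
     * null, blank, header, or too-short lines instead of throwing.
     */
    static Optional<String[]> parse(String line) {
        if (line == null) {
            return Optional.empty();
        }
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return Optional.empty();
        }
        // Assumption: fields are separated by one or more whitespace characters.
        // Splitting on "\\s+" avoids the empty strings that split(" ") produces
        // when two spaces appear in a row.
        String[] parts = trimmed.split("\\s+");
        if (parts[STATION_INDEX].contains(HEADER_MARKER)) {
            return Optional.empty();
        }
        // Guard the highest index we read; this is where the
        // ArrayIndexOutOfBoundsException would otherwise be thrown.
        if (parts.length <= MONTH_INDEX) {
            return Optional.empty();
        }
        return Optional.of(new String[] {parts[STATION_INDEX], parts[MONTH_INDEX]});
    }
}
```

Inside map() you would call StationLineParser.parse(value.toString()) once and emit output only when the result is present. The framework calls map() once per input record, so the method should not need a loop of its own to walk through the file.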