I am getting the error "The method addCacheFile(URI) is undefined for the type Job" with CDH4.0


I am getting the error

The method addCacheFile(URI) is undefined for the type Job

with CDH4.0 when trying to call the addCacheFile(URI uri) method, as shown below:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class DistributedCacheDriver {

    public static void main(String[] args) throws Exception {
        String inputPath = args[0];
        String outputPath = args[1];

        String fileName = args[2];
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "TestingDistributedCache");
        job.setJarByClass(DistributedCacheDriver.class);

        // Compile error here: The method addCacheFile(URI) is undefined for the type Job
        job.addCacheFile(new URI(fileName));

        boolean result = job.waitForCompletion(true);
        System.exit(result ? 0 : 1);
    }
}

Any suggestions/hints to get rid of this error?


Solution

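  • If you installed MapReduce version 1 (MRv1), Job has no addCacheFile() method. Replace the job.addCacheFile(uri) call with the static DistributedCache.addCacheFile(uri, conf) from org.apache.hadoop.filecache.DistributedCache, passing job.getConfiguration() as conf, since Job works on its own copy of the configuration. Then adapt the mapper accordingly: if it uses the old org.apache.hadoop.mapred API, the setup() method is called configure().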

    The official Hadoop documentation for DistributedCache has further details and examples.
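
    If you are not sure which MapReduce API your build actually picks up, a small reflective check can confirm it before you change any code. The sketch below is only an illustration: ApiProbe, hasMethod and locationOf are hypothetical names, not part of Hadoop, and it assumes you run it with the same classpath you compile the driver against.

```java
import java.net.URI;
import java.security.CodeSource;

/** Hypothetical helper (not part of Hadoop): checks which API a class on the classpath exposes. */
public class ApiProbe {

    /** Returns true if the named class can be loaded and has a public method with this signature. */
    public static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            // Load without running static initializers; we only inspect the signature.
            Class<?> type = Class.forName(className, false, ApiProbe.class.getClassLoader());
            type.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    /** Returns the jar or directory the class was loaded from, or "unknown" if that cannot be determined. */
    public static String locationOf(String className) {
        try {
            Class<?> type = Class.forName(className, false, ApiProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            return source == null ? "unknown" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        String job = "org.apache.hadoop.mapreduce.Job";
        System.out.println("Job loaded from: " + locationOf(job));
        System.out.println("Job.addCacheFile(URI) available: "
                + hasMethod(job, "addCacheFile", URI.class));
    }
}
```

    If the second line reports false, the Job class on your classpath does not offer addCacheFile(URI), and the DistributedCache route described above applies. The reported jar location should also show which Hadoop artifact the class comes from.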