My S3 directory layout is:
/sssssss/xxxxxx/rrrrrr/xx/file1
/sssssss/xxxxxx/rrrrrr/xx/file2
/sssssss/xxxxxx/rrrrrr/xx/file3
/sssssss/xxxxxx/rrrrrr/yy/file4
/sssssss/xxxxxx/rrrrrr/yy/file5
/sssssss/xxxxxx/rrrrrr/yy/file6
How can my MapReduce program read these files on S3?
For one input path you do the following:
FileInputFormat.addInputPath(job, new Path("/sssssss/xxxxxx/rrrrrr/xx/"));
For two input paths, you do the following:
FileInputFormat.addInputPath(job, new Path("/sssssss/xxxxxx/rrrrrr/xx/"));
FileInputFormat.addInputPath(job, new Path("/sssssss/xxxxxx/rrrrrr/yy/"));
or use addInputPaths(), which takes all the paths as a single comma-separated string. See the documentation of FileInputFormat (depending on your version of Hadoop) for more details.
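As a rough sketch of the addInputPaths() variant: the method expects one comma-separated string, so the two directories can be joined before the call. The class name InputPaths and the helper commaSeparated are hypothetical names used only for illustration; the snippet uses only the Java standard library, and the actual Hadoop call is shown in a comment on the assumption that Hadoop is on your classpath and a Job object named job exists.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class InputPaths {
    // Hypothetical helper: builds the comma-separated string that
    // FileInputFormat.addInputPaths(job, String) takes as its second argument.
    public static String commaSeparated(String base, List<String> subdirs) {
        return subdirs.stream()
                .map(dir -> base + "/" + dir + "/")
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Base directory and subdirectories taken from the question.
        String paths = commaSeparated("/sssssss/xxxxxx/rrrrrr", Arrays.asList("xx", "yy"));
        System.out.println(paths);

        // Assumption: with Hadoop available and a configured Job named "job",
        // the string would be passed like this:
        // FileInputFormat.addInputPaths(job, paths);
    }
}
```

This is equivalent to calling addInputPath() once per directory; the joined form is mainly convenient when the list of directories is built at runtime.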