I have created a HAR file containing multiple small input files. To run a MapReduce job with a single input file, the command is:
hadoop jar <jarname> <packagename.classname> <input> <output>
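For instance, with hypothetical names for the jar, driver class, and paths, that looks like:

hadoop jar wordcount.jar com.example.WordCount /user/hadoop/input.txt /user/hadoop/output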
But if the above <input> is a HAR file, what should the command be so that all the contents of the HAR file are treated as input?
If the input is a HAR file, then in place of the input path you pass a har:// URI pointing at the archive:

har:///<HDFS path to the HAR file>
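For example, assuming a hypothetical wordcount.jar with driver class com.example.WordCount and an archive stored at /user/hadoop/input.har, the full command would look like:

hadoop jar wordcount.jar com.example.WordCount har:///user/hadoop/input.har /user/hadoop/output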
Since Hadoop archives are exposed as a filesystem (the har:// scheme), MapReduce can use all the files inside the archive as input.
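The same har:// path also works when setting the input programmatically in a Java driver. A minimal sketch, assuming a hypothetical HarInputJob class and archive location (mapper/reducer setup omitted, so Hadoop's identity defaults apply):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class HarInputJob {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "har input example");
        job.setJarByClass(HarInputJob.class);
        // Set your Mapper, Reducer, and output key/value classes here,
        // exactly as you would for a plain HDFS input.
        // The har:// scheme makes the archive look like a directory,
        // so FileInputFormat picks up the files inside it as input splits.
        FileInputFormat.addInputPath(job, new Path("har:///user/hadoop/input.har"));
        FileOutputFormat.setOutputPath(job, new Path("/user/hadoop/output"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Nothing else in the job changes: only the input path's scheme tells Hadoop to read through the archive layer.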