I have been trying to run an Avro map-reduce job on Oozie. I specify the mapper and reducer classes in the workflow.xml and provide the other configs too, but it fails with:
java.lang.RuntimeException: class mr.sales.avro.etl.SalesMapper not org.apache.hadoop.mapred.Mapper
The same job completes and gives the desired output when run directly on a Hadoop cluster (and not via Oozie), so it seems probable that I am missing some Oozie config. What I guess from the exception is that Oozie requires the mapper to be a subclass of org.apache.hadoop.mapred.Mapper,
but Avro mappers have a different signature: they extend org.apache.avro.mapred.AvroMapper, and this may be the reason for the error.
So my question is: how do I configure the Oozie workflow/properties file to allow it to run an Avro map-reduce job?
With Avro, you'll need to configure a few extra properties:
org.apache.avro.mapred.HadoopMapper is the actual mapper class you need to set (this implements the Mapper interface).
The avro.mapper property should name your SalesMapper class.
There are other properties for the combiner and reducer too; check the AvroJob source and its utility methods.
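As a rough sketch, the mapper and reducer settings described above might look like this inside the map-reduce action's configuration element in workflow.xml. The property names are the ones AvroJob sets with the old mapred API, as far as I recall from its source. The reducer class name mr.sales.avro.etl.SalesReducer is hypothetical, since the question does not name it. A real job will most likely also need the schema, input/output format, and serialization properties, which are not shown here.

```xml
<configuration>
    <!-- Hadoop sees the Avro wrapper class, which implements org.apache.hadoop.mapred.Mapper -->
    <property>
        <name>mapred.mapper.class</name>
        <value>org.apache.avro.mapred.HadoopMapper</value>
    </property>
    <!-- The wrapper then instantiates the actual AvroMapper named here -->
    <property>
        <name>avro.mapper</name>
        <value>mr.sales.avro.etl.SalesMapper</value>
    </property>
    <!-- Same pattern for the reducer; SalesReducer is a hypothetical name -->
    <property>
        <name>mapred.reducer.class</name>
        <value>org.apache.avro.mapred.HadoopReducer</value>
    </property>
    <property>
        <name>avro.reducer</name>
        <value>mr.sales.avro.etl.SalesReducer</value>
    </property>
</configuration>
```

Treat the values above as a starting point and verify them against the job.xml of a run that works outside Oozie.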
Another way of doing this is to examine the job.xml from a job you submitted manually, and copy the relevant configuration properties over to your Oozie workflow.xml.