Tags: hadoop, apache-spark, avro

Using Spark to read Avro data throws an "org.apache.avro.util.Utf8 cannot be cast to java.lang.String" exception


I am using the following code to read Avro data in Spark:

import org.apache.avro.generic.GenericRecord
import org.apache.avro.mapred.{AvroInputFormat, AvroWrapper}
import org.apache.hadoop.io.NullWritable

val inputData = sc.hadoopFile(inputPath,
  classOf[AvroInputFormat[GenericRecord]],
  classOf[AvroWrapper[GenericRecord]],
  classOf[NullWritable]).map { t =>
    val genericRecord = t._1.datum() // unwrap the GenericRecord from the AvroWrapper
    genericRecord.get("name").asInstanceOf[String]
  }

The loading part works fine, but the conversion to String fails:

Caused by: java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be cast to java.lang.String

To simplify the example, I use the single line

    genericRecord.get("name").asInstanceOf[String]

That part actually comes from a library that is used without problems in a Hadoop MapReduce job. However, when I use the same library in Spark, it fails with the exception above.

I know I can change the code to genericRecord.get("name").toString() to make it work, but since the library already runs fine in another Hadoop MapReduce job, I am hoping every Utf8 can be converted to String automatically so that I don't need to change the code logic everywhere.
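For reference, here is a minimal sketch of that per-field workaround (the helper name nameOf is illustrative, not something from the library):

    import org.apache.avro.generic.GenericRecord

    // org.apache.avro.util.Utf8 implements CharSequence, so toString
    // always yields a java.lang.String without any cast.
    def nameOf(record: GenericRecord): String =
      record.get("name").toString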

To sum up: how can every org.apache.avro.util.Utf8 in a GenericRecord be converted to java.lang.String automatically?


Solution

  • It looks like the solution is to use AvroKey instead of AvroWrapper. The code below works: every org.apache.avro.util.Utf8 is converted to java.lang.String automatically, and the exception no longer occurs.

    import org.apache.avro.generic.GenericRecord
    import org.apache.avro.mapred.AvroKey
    import org.apache.avro.mapreduce.AvroKeyInputFormat
    import org.apache.hadoop.io.NullWritable

    val inputData = sc.newAPIHadoopFile(inputPath,
      classOf[AvroKeyInputFormat[GenericRecord]],
      classOf[AvroKey[GenericRecord]],
      classOf[NullWritable]).map { t =>
        val genericRecord = t._1.datum() // unwrap the GenericRecord from the AvroKey
        genericRecord.get("name").asInstanceOf[String] // now a java.lang.String
      }
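As a quick sanity check (a sketch reusing the inputData RDD defined above), printing a few values should now show plain strings instead of raising the ClassCastException:

    // Each element is already a java.lang.String after the map above,
    // so this should print names without any cast error.
    inputData.take(5).foreach(println)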