I have the following data structure (a relation) in Pig and want to pass it to a Java UDF. What Java data type should the input parameter be?
(The student relation is a bag; each tuple has an id of type int and a nested tuple that contains an interest bag and a classes bag.)
student: {id: int,(interest: {(value: chararray)},classes: {(value: chararray)})}
Thanks in advance, Lin
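I think it can be done as shown below. Pig passes a bag to a Java UDF as a DataBag (org.apache.pig.data.DataBag), a tuple as a Tuple, an int as an Integer, and a chararray as a String.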
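import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;

public class BagUdf extends EvalFunc<DataBag> {
    @Override
    public DataBag exec(Tuple input) throws IOException {
        // the bag passed from Pig is the first field of the input tuple
        DataBag bag = (DataBag) input.get(0);
        for (Tuple t : bag) {
            // process tuple t
        }
        return bag; // placeholder: return whatever DataBag your UDF builds
    }
}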
Please refer to this link