I'm using the Over function from Piggybank to get the lag of a row:
res = foreach (group table by fieldA) {
    Aord = order table by fieldB;
    generate flatten(Stitch(Aord, Over(Aord.fieldB, 'lag'))) as (fieldA, fieldB, lag_fieldB);
}
This runs, and when I do a dump I get the expected result. The problem is that when I try to use lag_fieldB in any comparison or transformation, I get datatype errors.
If I do a describe, it returns:
fieldA: long,fieldB: chararray,lag_fieldB: NULL
I'm new to Pig, but I have already tried casting to chararray and using ToString(), and I keep getting errors like these:
ERROR 1052: Cannot cast bytearray to chararray
ERROR 1051: Cannot cast to bytearray
Thanks for your help
OK, after looking through the code of the Over function, I found that you can instantiate the Over class with a constructor argument that sets the return type. What worked for me was:
DEFINE ChOver org.apache.pig.piggybank.evaluation.Over('chararray');
res = foreach (group table by fieldA) {
    Aord = order table by fieldB;
    generate flatten(Stitch(Aord, ChOver(Aord.fieldB, 'lag'))) as (fieldA, fieldB, lag_fieldB);
}
Now describe reports:
fieldA: long,fieldB: chararray,lag_fieldB: chararray
And I'm able to use the columns as expected. I hope this saves someone else some time.
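For anyone wondering what "use the columns as expected" looks like, here is a minimal sketch of a comparison against the lagged value. It assumes res was built with the typed ChOver definition, so lag_fieldB is a chararray, and that the first row of each group has a null lag (which is what I saw in my dumps). The alias names changed and flagged are made up for illustration.

```pig
-- Keep only rows whose value differs from the previous row in the group.
-- The null check skips the first row of each group, which has no previous row.
changed = FILTER res BY lag_fieldB IS NOT NULL AND fieldB != lag_fieldB;

-- Or keep every row and add a 0/1 flag instead of filtering.
flagged = FOREACH res GENERATE
    fieldA,
    fieldB,
    lag_fieldB,
    ((lag_fieldB IS NOT NULL AND fieldB != lag_fieldB) ? 1 : 0) AS is_changed;

DESCRIBE flagged;
```

With the untyped Over, comparisons like these are where the cast errors showed up for me, because lag_fieldB had no usable type.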