I want to concatenate all records that come from the same file using Pig. After loading the data with PigStorage and the '-tagFile' option, my data looks like:
(filename, aaaaaaaaaaa)
(filename, bbbbbbbbbbbbbb)
The result I want is:
(filename, aaaaaaaaaaabbbbbbbbbbbbbb)
Then I can store the data in HBase with the filename as the row key.
Any suggestions would be appreciated.
GROUP the data by filename, then use BagToString to concatenate the values in each group's bag into a single string. Passing an empty string as the delimiter joins them with nothing in between.
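-- A is the relation loaded with PigStorage and -tagFile; this assumes its first field is named filename
B = GROUP A BY filename;
-- A.$1 is the bag of record values in each group; '' joins them with no delimiter
C = FOREACH B GENERATE group, BagToString(A.$1, '');
DUMP C;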
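For completeness, here is a rough end-to-end sketch that includes the load and the HBase store the question mentions. The input path 'input_dir', the field names filename, line and content, the table name my_table and the column family cf are all placeholders I made up, so adjust them to your setup. It also assumes each record fits in a single tab-delimited field.

```pig
-- Hypothetical path and schema: -tagFile prepends the source file name as the first field
A = LOAD 'input_dir' USING PigStorage('\t', '-tagFile')
    AS (filename:chararray, line:chararray);

-- Collect all records that came from the same file
B = GROUP A BY filename;

-- Join each group's records into one string with no delimiter
C = FOREACH B GENERATE group AS filename, BagToString(A.line, '') AS content;

-- HBaseStorage uses the first field as the row key; my_table and cf:content are placeholders
STORE C INTO 'hbase://my_table'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:content');
```

One caveat: bags in Pig are unordered, so the order in which the records are concatenated is not guaranteed to match their order in the file. If order matters, you would need a field to sort on (for example in a nested ORDER inside the FOREACH) before calling BagToString.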