hadoop apache-pig cloudera bigdata

How to retrieve previous row value in Pig


Hi, I am using Pig to move values into HBase. I am trying to evaluate a condition: if it succeeds, I concatenate a value; if it fails, I concatenate the value from the previous row. I tried the code below, but it does not work and throws an error.

Code:

STOCK_A = LOAD '/user/cloudera/pat.hl7' USING PigStorage('|');
data = FILTER STOCK_A BY ($0 matches '.*OBR.*' or $0  matches '.*OBX.*');
MSH_DATA = FOREACH data GENERATE ($0 == 'OBR' ? CONCAT('HL','OBR',(chararray)$1) : CONCAT('HL','OBR',(chararray)(data -1).$1)) AS Uid, $1 AS id, $5 AS result, $3 AS resultname;

Error:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 14, column 122>  mismatched input '.' expecting RIGHT_PAREN

I want that concatenated value to be repeated in the following rows until I reach another OBR. Please help.


Solution

  • Pig itself has no way to refer to the previous row, but you can write an aggregate UDF that accepts all the rows and carries the value forward. Keep in mind that you also need to set parallelism to 1; otherwise your rows will be split into chunks and each chunk will be processed separately.
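
As a rough illustration of what such an aggregate UDF would have to do, here is a minimal Python sketch of the carry-forward logic only. The function name `fill_uid` and its `prefix` parameter are made up for this example, the field positions mirror `$1`, `$5` and `$3` from the question, and it assumes the rows arrive in file order. Registering it with Pig and declaring an output schema are left out.

```python
def fill_uid(rows, prefix="HL"):
    """Carry the Uid of the most recent OBR row forward onto later rows.

    rows: tuples of pipe-delimited fields, assumed to be in file order.
    Returns (uid, id, result, resultname) tuples, i.e. Uid, $1, $5, $3.
    """
    def field(row, i):
        # Short rows yield None instead of raising IndexError.
        return row[i] if i < len(row) else None

    out = []
    current_uid = None  # stays None until the first OBR row is seen
    for row in rows:
        if field(row, 0) == "OBR":
            # Same value the question builds with CONCAT('HL','OBR',$1).
            current_uid = prefix + "OBR" + str(field(row, 1))
        out.append((current_uid, field(row, 1), field(row, 5), field(row, 3)))
    return out
```

Calling `fill_uid` on a list that starts with an OBR row followed by OBX rows gives every one of those rows the Uid built from that OBR row, and the Uid changes only when the next OBR row appears. The single running variable is also why the rows must not be split into chunks: a chunk that starts with an OBX row has no earlier OBR to copy from.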