I want to get the last element of a line using a Pig script. I can't use a fixed $ position because the index of the last element is not fixed. I tried $-1, but it didn't work, and I tried a regular expression, but that is not working either. I am posting only a sample, as my actual file contains many more PID lines.
Sample:
MSH|^~\&|LAB|LAB|HEATH|HEA-HEAL|20247||OU^R01|M1738000000001|P|2.3|||ER|ER|
PID|1|YXQ120185751001|YXQ120185751001||ELJKDP@#PDUB||19790615|F||| H LGGH VW^^ZHVW FKHVWHU^SD^19380|||||||4002C340778A|000009561|ELJKDP@#PDUB19790615F
I want to get the last value of the PID line, i.e. ELJKDP@#PDUB19790615F.
For that I have tried the code below, but it is not working.
Code 1:
STOCK_A = LOAD '/user/rt/PARSED' USING PigStorage('|');
data = FILTER STOCK_A BY ($0 matches '.*PID.*');
MSH_DATA = FOREACH data GENERATE $2 AS id, $5 AS ame , $7 AS dob, $8 AS gender, $-1 AS rk;
Code 2:
STOCK_A = LOAD '/user/rt/PARSED' USING PigStorage('|');
data = FILTER STOCK_A BY ($0 matches '.*PID.*');
MSH_DATA = FOREACH data GENERATE $2 AS id, $5 AS ame , $7 AS dob, $8 AS gender, REGEX_EXTRACT(data,'\\s*(\\w+)$',1) AS rk;
Error for Code 2:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: Pig script failed to parse: Invalid scalar projection: data : A column needs to be projected from a relation for it to be used as a scalar
Please help
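This should work, as long as the first argument is a chararray field that holds the whole unsplit line (called line here) rather than the relation alias data, which is what triggers ERROR 1200. With PigStorage('|') the line has already been split into fields, so load it unsplit instead, for example with TextLoader.
REGEX_EXTRACT(line,'([^|]+$)',1) AS rk
[^|]+$ matches the run of non-pipe characters at the end of the line, that is, everything to the right of the last pipe character. If the line ends with a pipe, as the MSH line in the sample does, there is no match and the result is null.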
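A minimal sketch of how that expression might be wired into a complete script, assuming each record is loaded as a single unsplit chararray. The aliases raw, pid and last and the field name line are illustrative names, not part of the original code. The path is the one from the question.

```pig
-- Load every record as one chararray; TextLoader does not split on '|'.
raw = LOAD '/user/rt/PARSED' USING TextLoader() AS (line:chararray);

-- Keep only PID segments. 'matches' has to match the whole string,
-- hence the trailing '.*'.
pid = FILTER raw BY line matches 'PID\\|.*';

-- Capture the run of non-pipe characters at the end of the line.
last = FOREACH pid GENERATE REGEX_EXTRACT(line, '([^|]+)$', 1) AS rk;

DUMP last;
```

For the sample PID line above, rk should come out as ELJKDP@#PDUB19790615F. The fixed-position fields (id, dob and so on) would still need to be taken from the same line separately, for instance by splitting it on the pipe character.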