How to write a Pig script when a line contains the same delimiter more than once?


I have the following line in my "test.csv" file:

1987654,file not uploaded,please try again,Johnson

I would like to get the following output using Pig:

Task ID: 1987654
Message: file not uploaded,please try again
User: Johnson


Solution

  • Since every line has the same format (the message always contains exactly one comma), the simplest solution is to load the line into four fields using a comma as the delimiter, then use CONCAT to rejoin the second and third fields with a comma between them.

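    -- Load each line as four comma-separated fields.
    A = LOAD 'test.csv' USING PigStorage(',') AS (a1:int, a2:chararray, a3:chararray, a4:chararray);
    -- Rejoin the two halves of the message, restoring the comma that the loader split on.
    B = FOREACH A GENERATE a1, CONCAT(CONCAT(a2, ','), a3), a4;
    DUMP B;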
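If the message can contain a varying number of commas, a fixed four-field schema no longer fits. One possible alternative, shown here only as a minimal sketch, is to read each line whole with TextLoader and split it on the first and last comma with REGEX_EXTRACT_ALL. This assumes the task ID and the user name never contain commas themselves. The aliases `raw` and `parsed` and the field names `task_id`, `message` and `user` are illustrative and not part of the original answer.

```pig
-- Read every line as a single chararray instead of splitting on commas.
raw = LOAD 'test.csv' USING TextLoader() AS (line:chararray);

-- Group 1: everything before the first comma (task ID).
-- Group 2: everything in between, commas included (message).
-- Group 3: everything after the last comma (user).
-- REGEX_EXTRACT_ALL must match the whole line, so no anchors are needed.
parsed = FOREACH raw GENERATE
    FLATTEN(REGEX_EXTRACT_ALL(line, '([^,]*),(.*),([^,]*)'))
    AS (task_id:chararray, message:chararray, user:chararray);

DUMP parsed;
```

In this sketch `task_id` comes back as a chararray. It can be cast with `(int)task_id` in a later FOREACH if a numeric type is needed.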