Conditional statements in PIG


I have the input below in a text file and need to generate output in another file based on the logic that follows. Here is my input file:

customerid|Dateofsubscription|Customercode|CustomerType|CustomerText
1001|2017-05-23|455|CODE|SPRINT56
1001|2017-05-23|455|DESC|Unlimited Plan
1001|2017-05-23|455|DATE|2017-05-05
1002|2017-05-24|455|CODE|SPRINT56
1002|2017-05-24|455|DESC|Unlimited Plan
1002|2017-05-24|455|DATE|2017-05-06

Logic:

If Customercode = 455
    if (CustomerType = "CODE")
        Val = CustomerText
    if (CustomerType = "DESC")
        Description = CustomerText
    if (CustomerType = "DATE")
        Date = CustomerText

Output:

customerid|Val|Description|Date
1001|SPRINT56|Unlimited Plan|2017-05-05
1002|SPRINT56|Unlimited Plan|2017-05-06

Could you please help me with this?


Solution

    -- Placeholder: load the pipe-delimited input using the field names from the question's header
    rawData = LOAD data;
    filteredData = FILTER rawData BY (Customercode == 455);

    -- Set Val/Description/Date from CustomerText according to CustomerType, and null otherwise
    -- (Pig string literals take single quotes, not double quotes)
    ExtractedData = FOREACH filteredData GENERATE
                customerid,
                (CustomerType == 'CODE' ? CustomerText : null) AS Val,
                (CustomerType == 'DESC' ? CustomerText : null) AS Description,
                (CustomerType == 'DATE' ? CustomerText : null) AS Date;

    groupedData = GROUP ExtractedData BY customerid;

    -- MAX ignores nulls, so each customer collapses to one row holding its non-null values
    finalData = FOREACH groupedData GENERATE
                 group AS customerid,
                 MAX(ExtractedData.Val) AS Val,
                 MAX(ExtractedData.Description) AS Description,
                 MAX(ExtractedData.Date) AS Date;

    DUMP finalData;
    

I have specified only the core logic. Loading, formatting, and storage should be straightforward.
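
For the loading and storage steps, one possible end-to-end sketch is shown below. The paths `customers.txt` and `customer_output` are hypothetical, and the schema is an assumption taken from the header line in the question. Declaring `Customercode` as `int` means the header row fails the cast, becomes null, and is dropped by the filter, so it needs no separate handling.

```pig
-- Hypothetical input path; fields are split on '|' and named after the question's header
rawData = LOAD 'customers.txt' USING PigStorage('|')
          AS (customerid:chararray, Dateofsubscription:chararray,
              Customercode:int, CustomerType:chararray, CustomerText:chararray);

-- The header row has a non-numeric Customercode, which becomes null and is filtered out here
filteredData = FILTER rawData BY Customercode == 455;

-- One column per CustomerType; rows of other types contribute null
ExtractedData = FOREACH filteredData GENERATE
                customerid,
                (CustomerType == 'CODE' ? CustomerText : null) AS Val,
                (CustomerType == 'DESC' ? CustomerText : null) AS Description,
                (CustomerType == 'DATE' ? CustomerText : null) AS Date;

groupedData = GROUP ExtractedData BY customerid;

-- MAX skips nulls, leaving the single non-null value per column for each customer
finalData = FOREACH groupedData GENERATE
            group AS customerid,
            MAX(ExtractedData.Val) AS Val,
            MAX(ExtractedData.Description) AS Description,
            MAX(ExtractedData.Date) AS Date;

-- Hypothetical output directory; Pig writes pipe-delimited part files without a header row
STORE finalData INTO 'customer_output' USING PigStorage('|');
```

Note that `STORE` writes a directory of part files rather than a single file, and the header line `customerid|Val|Description|Date` from the expected output would have to be added in a separate step.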