Search code examples
hadoopapache-pig

How to use CASE statement in pig?


I need the following output.

NE 50
SE 80

I am using pig query to count the country based on zone.

c1 = group country by zone;
c2 = foreach c1 generate COUNT(country.zone), (
case country.zone
when 1 then 'NE'
else 'SE'
);

But I am not able to achieve my output. I am getting error like the following:

2016-03-30 13:57:16,569 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1039: (Name: Equal Type: null Uid: null)incompatible types in Equal Operator left hand side:bag :tuple(zone:int)  right hand side:int
Details at logfile: /home/cloudera/pig_1459370643493.log

But I was able to do using following query.

c2 = foreach c1 generate group, COUNT(country.zone);

This will give following output:

(1,50)
(2,80)

How can I add NE instead of 1 and SE instead of 2? I thought using CASE would help but I am getting error. Can anyone help?


Solution

  • EDIT

    Pig 0.12.0 Version now supports CASE expression.

    c2 = FOREACH c1 GENERATE (CASE group
                                  WHEN 1 THEN 'NE' 
                                  WHEN 2 THEN 'SE'
                                  WHEN 3 THEN 'AE'
                                  ELSE 'VR' END), COUNT(country.zone);
    

    Older Pig Versions

    Pig does not have a case statement.Your best option is to use UDF.If the group values are limited to only two then you can use bincond operator to check the value

    c2 = foreach c1 generate (group == 1 ? 'NE' : 'SE'), COUNT(country.zone);
    

    If you have multiple values then use this.I've used test values to generate the output.

    Input

    Input

    c2 = FOREACH c1 GENERATE (group == 1 ? 'NE' : 
                             (group == 2 ? 'SE' :
                             (group == 3 ? 'AE' : 'VR'))), COUNT(country.zone);
    

    Output

    Output