I need the following output.
NE 50
SE 80
I am using pig query to count the country based on zone.
c1 = group country by zone;
c2 = foreach c1 generate COUNT(country.zone), (
case country.zone
when 1 then 'NE'
else 'SE'
);
But I am not able to achieve my output. I am getting error like the following:
2016-03-30 13:57:16,569 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1039: (Name: Equal Type: null Uid: null)incompatible types in Equal Operator left hand side:bag :tuple(zone:int) right hand side:int
Details at logfile: /home/cloudera/pig_1459370643493.log
But I was able to do using following query.
c2 = foreach c1 generate group, COUNT(country.zone);
This will give following output:
(1,50)
(2,80)
How can I add NE instead of 1 and SE instead of 2? I thought using CASE would help but I am getting error. Can anyone help?
EDIT
Pig 0.12.0 Version now supports CASE expression.
c2 = FOREACH c1 GENERATE (CASE group
WHEN 1 THEN 'NE'
WHEN 2 THEN 'SE'
WHEN 3 THEN 'AE'
ELSE 'VR' END), COUNT(country.zone);
Older Pig Versions
Pig does not have a case statement.Your best option is to use UDF.If the group values are limited to only two then you can use bincond operator to check the value
c2 = foreach c1 generate (group == 1 ? 'NE' : 'SE'), COUNT(country.zone);
If you have multiple values then use this.I've used test values to generate the output.
Input
c2 = FOREACH c1 GENERATE (group == 1 ? 'NE' :
(group == 2 ? 'SE' :
(group == 3 ? 'AE' : 'VR'))), COUNT(country.zone);
Output