apache-pig apache-pig-grunt

SUM function in PIG


I'm starting to learn Pig Latin scripting and am stuck on the issue below. I have gone through similar questions on the same topic without any luck. I want to find the SUM of all the age fields.

grunt> DUMP X;
(22)
(19)
grunt> DESCRIBE X;
X: {age: int}

I tried several options, such as:

Y = FOREACH ( group X all ) GENERATE SUM(X.age);

But I get the exception below.

 Invalid field projection. Projected field [age] does not exist in schema: group:chararray,X:bag{:tuple(age:int)}.

Thanks for your time and help.


Solution

  • I think the Y projection should work as you wrote it. Here is my small example of the same thing, and it works fine for me.

     X = LOAD 'SO/sum_age.txt' USING PigStorage('\t') AS (age:int);
     DESCRIBE X;
     Y = FOREACH ( group X all ) GENERATE 
         SUM(X.age);
     DESCRIBE Y;
     DUMP Y;
    

    So your problem looks strange. I used the following input data:

    -bash-4.1$ cat sum_age.txt 
    22
    19
    

    Can you try running the script I posted here on the same data?
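
One way to narrow this down, as a minimal sketch: split the statement into two steps and DESCRIBE the grouped relation. The alias G and the field name total_age are names made up for illustration; the path and schema are the ones from the example above. After GROUP ... ALL the relation has only two fields, group and a bag named X, so age is reachable only through the bag, as X.age. The schema printed in the error message suggests that the failing statement may have referred to age directly (for example SUM(age)) rather than X.age, though that is only a guess from the message.

```pig
-- Load one integer per line (same file and schema as in the example above).
X = LOAD 'SO/sum_age.txt' USING PigStorage('\t') AS (age:int);

-- Put every row into a single group. G is a hypothetical alias.
G = GROUP X ALL;

-- Should show two fields: group (chararray) and a bag named X holding (age:int) tuples.
DESCRIBE G;

-- age lives inside the bag X, so project it as X.age.
-- A bare SUM(age) here would look for age at the top level of G, where it does not exist.
Y = FOREACH G GENERATE SUM(X.age) AS total_age;

-- With the two input rows 22 and 19 this should print (41).
DUMP Y;
```

If DESCRIBE G shows the bag under a different name, use that name in place of X inside SUM.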