Apache Pig AVG Function


I'm trying to write a simple Pig query that finds the average rating for the movie with id 178. I've tried several versions of the script below; the filter works, but the AVG function does not. Can anyone advise? Thanks

a = load '/user/pig/u.data' AS (userid:int, movieid:int, rating:double, timestamp:chararray);  
b = FOREACH a GENERATE AVG(rating) as rate, movieid;
c = group b by rate;
d= filter a by movieid==178;
dump d;

Solution

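  • AVG is an aggregate function that operates on a bag, so it cannot be applied to the individual rows of a. Filter first, then group by movieid and apply AVG to the grouped ratings: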

    b = FILTER a BY (movieid == 178);
    c = GROUP b BY movieid;
    -- c holds the grouped rows in a bag named b, so AVG reads b.rating, not a.rating
    d = FOREACH c GENERATE group AS movieid, AVG(b.rating) AS rate;
    DUMP d;
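
For reference, here is a sketch of the whole script, combining the LOAD statement from the question with the filter, group and average steps from the answer. It assumes that u.data is tab-separated, since a LOAD without a USING clause falls back to PigStorage with a tab delimiter. The DESCRIBE line is an optional addition, not part of the original answer; it prints the schema of c, which should show why the bag is referenced as b.

```pig
-- Assumption: u.data is tab-separated, matching PigStorage's default delimiter.
a = LOAD '/user/pig/u.data' AS (userid:int, movieid:int, rating:double, timestamp:chararray);

-- Keep only the ratings for movie 178.
b = FILTER a BY movieid == 178;

-- Collect the remaining rows into one bag per movieid.
c = GROUP b BY movieid;

-- Optional: show the schema of c (a group key plus a bag named b).
DESCRIBE c;

-- AVG aggregates the rating field projected out of the bag b.
d = FOREACH c GENERATE group AS movieid, AVG(b.rating) AS rate;

DUMP d;
```

Filtering before grouping means the GROUP only has to handle the rows for one movie. Dropping the FILTER line and grouping a directly (with AVG(a.rating)) would instead give one average per movie.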