I'm trying to write a simple Pig query that finds the average rating for the movie with id 178. I've tried several versions of the script below; the filter works, but the AVG function does not. Can anyone advise? Thanks
a = load '/user/pig/u.data' AS (userid:int, movieid:int, rating:double, timestamp:chararray);
b = FOREACH a GENERATE AVG(rating) as rate, movieid;
c = group b by rate;
d= filter a by movieid==178;
dump d;
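You should group by movieid. AVG is an aggregate function that operates on a bag, so it cannot be applied to a plain column of the loaded relation. Filter first, group the filtered rows by movieid, and then average the rating column of the grouped bag: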
b = FILTER a BY (movieid == 178);
c = GROUP b BY movieid;
-- After GROUP, each row of c holds a bag named b, so AVG must reference b.rating
d = FOREACH c GENERATE group AS movieid, AVG(b.rating) AS rate;
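For reference, here is a minimal end-to-end sketch of the same filter, group, average sequence, including the DUMP that prints the result. The relation names (ratings, movie178, grouped, avg_rating) are made up for readability, and it assumes u.data is tab-delimited, which is what the default loader expects when no USING clause is given.

```pig
-- Load the ratings file; with no USING clause, Pig uses PigStorage with a tab delimiter (assumed to match u.data).
ratings = LOAD '/user/pig/u.data' AS (userid:int, movieid:int, rating:double, timestamp:chararray);

-- Keep only the rows for movie 178.
movie178 = FILTER ratings BY movieid == 178;

-- GROUP produces one row per movieid, holding a bag named after the grouped relation (movie178).
grouped = GROUP movie178 BY movieid;

-- AVG takes a bag, so the rating column is referenced through the bag name.
avg_rating = FOREACH grouped GENERATE group AS movieid, AVG(movie178.rating) AS rate;

DUMP avg_rating;
```

Filtering before grouping keeps the grouped bag small, since only the rows for the one movie are collected.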