apache-pig

extract the list of different movies for each year using Pig


I have a simple Pig script and want to get the number of films for each year. I loaded the contents of the file into movies and wrote this code:

groupingyear = group movies by year;
vrar = foreach groupingyear generate movies.year, COUNT(movies.year); 

The counts are correct, but I want a (year, number of films) structure. Instead, each output row contains the year written many times. Why is the year repeated?


Solution

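  • You are projecting and counting movies.year. After the group, movies.year is a bag that holds the year once for every movie in that group, which is why the year is written many times. Use group, which holds the grouping key (the year), instead. Assuming your movies dataset has a movie_name field, count that field: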

    groupingyear = group movies by year;
    vrar = foreach groupingyear generate group, COUNT(movies.movie_name);
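
    For reference, here is a minimal end-to-end sketch of the same idea. The file name movies.csv, the comma delimiter, and the two-field schema are assumptions for illustration only; adjust them to match your own data.

```pig
-- Hypothetical input: one movie per line, e.g. "Some Title,1999".
-- The path, delimiter, and schema below are assumptions.
movies = LOAD 'movies.csv' USING PigStorage(',')
         AS (movie_name:chararray, year:int);

-- One output row per distinct year. Each row has two fields:
--   group  : the grouping key (the year)
--   movies : a bag with every movie tuple for that year
groupingyear = GROUP movies BY year;

-- Inspect the schema to see the 'group' key and the 'movies' bag.
DESCRIBE groupingyear;

-- Emit the key once per row and count the tuples in the bag.
vrar = FOREACH groupingyear GENERATE group AS year, COUNT(movies) AS num_films;

DUMP vrar;
```

    Note that COUNT skips tuples whose first field is null. If some rows might have a null in the first field and should still be counted, COUNT_STAR(movies) counts every tuple in the bag.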