hadoop apache-pig bigdata

Find the group with the maximum number of rows [using Pig]


I have to find the maximum number of posts created by a single person in a given data set, where each post comes with a user id, display name, age, comments count, view count, date, score and title.

To get the maximum number of posts, I think we can group by user id. After grouping, I need to find the id whose group contains the most entries. I don't understand how to solve this latter part. Please help.


Solution

  • As I understand your question, you want the number of posts per user, so I am answering accordingly.

    Try this code:

    a = load '<path>' using PigStorage(',') as (userId,displayName,age,commentsCount,viewCount,date,score,title);
    
    b = group a by userId;
    
    c = foreach b generate group,COUNT(a.title);
    
    dump c;
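
The code above produces one post count per user, but it stops short of picking the user with the most posts. A minimal sketch of that last step, assuming the same comma-separated input and field order as above: the relation names (posts, by_user, counts, ranked, top_user) and the postCount alias are made up for illustration, and '<path>' is left as the placeholder it was.

```pig
-- Load the posts; the schema mirrors the field list from the question.
posts = LOAD '<path>' USING PigStorage(',')
        AS (userId, displayName, age, commentsCount, viewCount, date, score, title);

-- Collect all posts of each user into one group.
by_user = GROUP posts BY userId;

-- One row per user: (userId, number of posts in that user's group).
counts = FOREACH by_user GENERATE group AS userId, COUNT(posts) AS postCount;

-- Sort by post count, highest first, and keep only the top row.
ranked = ORDER counts BY postCount DESC;
top_user = LIMIT ranked 1;

DUMP top_user;
```

Two things to keep in mind with this sketch. LIMIT 1 returns a single row, so if several users tie for the highest count only one of them is shown. Also, COUNT skips tuples whose first field is null; if such rows should be counted as posts too, COUNT_STAR is the variant that includes them.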