Search code examples
hadoopapache-pig

Access schema value in pig


Dataset - Contains PostId and userID

20    1
21    2
45    3
85    1
48    1
98    1
74    1
96    2
63    2
33    3
44    3
55    3
66    3
77    3

I want to access the userID with maximum no. of post

PIG code

A = load '/home/cloudera/Desktop/post.txt' as (postid:chararray, userid:chararray);

B = load '/home/cloudera/Desktop/user.txt' as (name:chararray, id:chararray);

C = group A by userid;

D = foreach C generate group,COUNT(A.postid) as count;

E = order D by count DESC;

F = limit D 1;

It gives output -

(3,6)

Now what should be the PIG statement to access username from user.txt whose id is same as A.userid after execution of F statement?


Solution

  • Add another statement to get the first column from relation F

    G = FOREACH F GENERATE $0;
    DUMP G;