How to dump a particular column from a row of a text file in Pig?


I am new to Pig. I have my data in a .txt file and I want to retrieve a particular column from it. The columns in this file are separated by ; characters.

For example, if the row is

1;1;13;2010-09-13T19:16:26.763;239;383084;10;16575;2013-04-05T15:50:48.133;2015-11-21T04:55:50.150;I've rooted my phone. Now what? What do I gain from rooting?;2;0;162;2011-01-25T08:44:10.820; ,

then I want to retrieve the 4th column from the above row.

So, what should the Pig script be to retrieve the 4th column, i.e. (239)?


Solution

  • Your delimiter is a semicolon, so load the file with PigStorage(';'):

    A = LOAD '/path/to/file' USING PigStorage(';');
    DUMP A;
    

    Output of dump A:

    (1,1,13,2010-09-13T19:16:26.763,239,383084,10,16575,2013-04-05T15:50:48.133,2015-11-21T04:55:50.150,I've rooted my phone. Now what? What do I gain from rooting?,2,0,162,2011-01-25T08:44:10.820)

    -- Positions are zero-based: $0 is the first field, so $4 is the field holding 239.
    B = FOREACH A GENERATE $4;
    DUMP B;
    

    Output of dump B:

    (239)

    You can use AS in the LOAD statement if you want to name your columns and retrieve them by name:

    A = LOAD '/path/to/file' USING PigStorage(';') AS (col1,col2...);

    -- Dump a given column by its name.
    B = FOREACH A GENERATE col1;
    DUMP B;
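
    When no schema is given in LOAD, Pig treats each field as a bytearray. As a minimal sketch, assuming the field at position $4 always holds an integer, the positional reference can be combined with a cast and an alias. The name col5 is a hypothetical label chosen here for illustration, not something present in the data.

    ```pig
    -- Load without a schema; fields are addressed by position ($0, $1, ...).
    A = LOAD '/path/to/file' USING PigStorage(';');

    -- Cast the field at position 4 to int and give it a name.
    -- "col5" is an assumed, illustrative alias.
    B = FOREACH A GENERATE (int)$4 AS col5;

    -- Show the resulting schema, then print the values.
    DESCRIBE B;
    DUMP B;
    ```

    DESCRIBE is a convenient way to confirm that the cast and the alias took effect before dumping or storing the relation.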