I am new to Pig. I have my data in a .txt file and I want to retrieve a particular column from it. The columns are separated with ; in this text file.
For example, if the row is
1;1;13;2010-09-13T19:16:26.763;239;383084;10;16575;2013-04-05T15:50:48.133;2015-11-21T04:55:50.150;I've rooted my phone. Now what? What do I gain from rooting?;2;0;162;2011-01-25T08:44:10.820;
then I want to retrieve the 4th column from the above row.
So, what should the Pig script be to retrieve the 4th column, i.e. (239)?
Your delimiter is a semicolon, so load the file with PigStorage and pass ';' as the delimiter:
A = LOAD '/path/to/file' USING PigStorage(';');
DUMP A;
Output of dump A:
(1,1,13,2010-09-13T19:16:26.763,239,383084,10,16575,2013-04-05T15:50:48.133,2015-11-21T04:55:50.150,I've rooted my phone. Now what? What do I gain from rooting?,2,0,162,2011-01-25T08:44:10.820)
B = FOREACH A GENERATE $4;  -- positions are zero-based, so $4 is the fifth field (239)
DUMP B;
Output of dump B:
(239)
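Since positional references start at $0, it is easy to be off by one when counting columns. As a small sketch, assuming the same relation A loaded above, you can project a couple of neighbouring positions to check which one you actually want:

```pig
-- $0 is the first field, so $3 is the timestamp and $4 is 239.
B = FOREACH A GENERATE $3, $4;
DUMP B;
```

For the sample row this should print (2010-09-13T19:16:26.763,239).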
You can use AS in the LOAD statement if you want to give names to your columns and retrieve them by name:
A = LOAD '/path/to/file' USING PigStorage(';') AS(col1,col2...);
Dumping a given column by name:
B = FOREACH A GENERATE col1;
DUMP B;
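A minimal sketch of the named version for the sample row follows. The field names f0 to f4 and their types are hypothetical, chosen only for illustration, and only the first five columns are declared. As far as I know, Pig drops any fields beyond the declared schema, but if that does not hold on your version, declare every column instead.

```pig
-- Hypothetical field names and types; only the first five columns are declared.
A = LOAD '/path/to/file' USING PigStorage(';')
    AS (f0:int, f1:int, f2:int, f3:chararray, f4:int);
-- f4 is the same field as $4 in the positional version.
B = FOREACH A GENERATE f4;
DUMP B;
```

Declaring a type such as int also means the value is not left as the default bytearray, which helps if you later filter or aggregate on it.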