I am learning Apache Pig and am trying to load some data into Pig. When I open the .txt file in the vi editor, I see the following (sample) row:
[ABBOTT,DEEDEE W GRADES 9-12 TEACHER 52,122.10 0 LBOE ATLANTA INDEPENDENT SCHOOL SYSTEM 2010].
I use the following command to load data into a pig relation.
A = LOAD 'salaryTravelReport_sample.txt' USING PigStorage() as (name:chararray,
prof:chararray,max_sal:float,travel:float,board:chararray,state:chararray,year:int);
However, when I do a dump in pig in the distributed environment, I find the following result (for the row mentioned above):
(ABBOTT,DEEDEE W,GRADES 9-12 TEACHER,,0.0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010).
The numeric value "52,122.10" seems to be missing.
Please help.
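PigStorage() is a built-in Pig load function that takes the field delimiter (not the record delimiter) as its argument. Here the delimiter is a tab (\t), which is also the default when no argument is given: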
A = LOAD 'salaryTravelReport_sample.txt' USING PigStorage('\t') as (name:chararray,
prof:chararray,max_sal:float,travel:float,board:chararray,state:chararray,year:int);
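If the fields are already split correctly (the dump above shows seven fields, with only max_sal empty), a likely cause is the thousands separator: "52,122.10" cannot be converted to a float, and Pig turns a failed conversion into null. A minimal sketch of one workaround, assuming the file is tab-delimited and that the comma is the only non-numeric character in that column, is to load max_sal as chararray, strip the commas with the built-in REPLACE function, and cast afterwards. The relation name B is only illustrative.

```pig
-- Load max_sal as text so that "52,122.10" is not dropped during the float conversion.
A = LOAD 'salaryTravelReport_sample.txt' USING PigStorage('\t') AS (name:chararray,
    prof:chararray, max_sal:chararray, travel:float, board:chararray, state:chararray, year:int);

-- Remove the thousands separator, then cast the cleaned string to float.
B = FOREACH A GENERATE name, prof,
    (float) REPLACE(max_sal, ',', '') AS max_sal,
    travel, board, state, year;

DUMP B;
```

If the travel column can also contain thousands separators, the same REPLACE-then-cast step would be needed there.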