hadoop hdfs apache-pig

Getting an error when loading data in Pig with year


I want to write a Pig Latin script that loads all the data after 1951 (1951 not included), filters the data where quality = 1, groups the data by temperature, and then calculates the largest year for each temperature.

I tried this:

records = load '/user/a106524609/test.txt' using PigStorage(' ') as 
(year:chararray, temperature:int, quality:int);
rec1 = filter records by year >1951 and (quality == 1);

I am getting an error when I run this.


Solution

  • You are loading year into a chararray field and then comparing it with 1951, which is an int. You have two options: load year as an int, or cast year to int in the filter statement.

    records = load '/user/a106524609/test.txt' using PigStorage(' ') as 
    (year:int, temperature:int, quality:int);
    rec1 = filter records by year > 1951 and (quality == 1);
    

    Or

    records = load '/user/a106524609/test.txt' using PigStorage(' ') as 
    (year:chararray, temperature:int, quality:int);
    rec1 = filter records by (int)year > 1951 and (quality == 1); -- cast the chararray year to int before comparing
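
The question also asks for the largest year per temperature, which the snippets above do not cover. A minimal sketch of the remaining steps, assuming year is loaded as an int as in the first option; the relation names `filtered`, `grouped`, and `latest` are made up for illustration:

```pig
-- Load with year as int so the numeric comparison works directly
records = LOAD '/user/a106524609/test.txt' USING PigStorage(' ')
          AS (year:int, temperature:int, quality:int);

-- Keep only rows strictly after 1951 with quality 1
filtered = FILTER records BY year > 1951 AND quality == 1;

-- One group per distinct temperature
grouped = GROUP filtered BY temperature;

-- For each temperature, take the largest year in its bag
latest = FOREACH grouped GENERATE group AS temperature,
                                  MAX(filtered.year) AS max_year;

DUMP latest;
```

`MAX` here is Pig's built-in aggregate, applied to the bag of years inside each group. If year has to stay a chararray, the same script should work with the `(int)year` cast in the FILTER and in the projection passed to `MAX`.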