I have a Pig input file that looks like this:
1, cornflakes, Regular, Post, 10
2, cornflakes, Regular,General Mills, 12
3, cornflakes, Mixed Nuts, Post, 14
4, chocolate syrup, Regular, Hersheys, 5
5, chocolate syrup, No High Fructose, Hersheys, 8
6, chocolate syrup, Regular, Ghirardeli, 6
7, chocolate syrup, Strawberry Flavor, Ghirardeli, 7
I need to filter out the cornflakes rows whose price is less than 12, and then I need to use the original data set for the next filtering step.
total = LOAD 'location_of_file' USING PigStorage(',') AS (item_sl : int, item : chararray, type: chararray, manufacturer: chararray, price : int);
filter1 = FILTER total BY item == 'cornflakes' AND price < 12;
After filter1, I need to run the next filtering step against the original dataset. How can I do that?
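Use SPLIT. It divides a relation into two or more relations in a single statement: here filter1 receives the rows that match the condition and filter2 receives the rest, while total itself is unchanged and can still be used in later statements.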
total = LOAD '/output/systemhawk/file_inventory/test34.txt' USING PigStorage(',') AS (item_sl : int, item : chararray, type: chararray, manufacturer: chararray, price : int);
SPLIT total INTO filter1 IF (item == 'cornflakes' AND price < 12), filter2 OTHERWISE;
DUMP filter2;
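For the next filtering step you can continue from either relation. The following is only a sketch: the second-step conditions and the aliases cheap_syrup and post_items are made up for illustration, and it assumes the fields carry no stray whitespace around the commas (PigStorage does not trim values).

```pig
-- Same load and split as above.
total = LOAD '/output/systemhawk/file_inventory/test34.txt' USING PigStorage(',')
        AS (item_sl : int, item : chararray, type : chararray, manufacturer : chararray, price : int);
SPLIT total INTO filter1 IF (item == 'cornflakes' AND price < 12), filter2 OTHERWISE;

-- Option 1: keep filtering the rows that did NOT match the first condition.
-- (hypothetical second condition, for illustration only)
cheap_syrup = FILTER filter2 BY item == 'chocolate syrup' AND price < 7;

-- Option 2: filter the full, untouched data set again.
-- (hypothetical condition, for illustration only)
post_items = FILTER total BY manufacturer == 'Post';

DUMP cheap_syrup;
DUMP post_items;
```

Which option fits depends on what "original" means in your case: filter2 is everything that filter1 did not take, whereas total is still the complete input.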