Search code examples
hadoopapache-pig

Filter a string on the basis of a word


I have a pig job where in I need to filter the data by finding a word in it,

Here is the snippet

A = LOAD '/home/user/filename' USING PigStorage(',');
B = FOREACH A GENERATE $27,$38;
C = FILTER B BY ( $1 ==  '*Word*');
STORE C INTO '/home/user/out1' USING PigStorage();

The error is in the 3rd line while finding C, I have also tried using

C = FILTER B BY $1 MATCHES '*WORD*'  

Also

C = FILTER B BY $1 MATCHES '\\w+WORD\\w+'  

Solution

  • MATCHES uses regular expressions. You should do ... MATCHES '.*WORD.*' instead.

    These is an example here finding the word 'apache'.