awk, comparison, filtering

filter based on first two letters of a column


I have a file that looks like this:

345-103832 OI.S.15.0FKOGO   
345-103832 OX.S.5.0FKOGO   
345-103832 QX.S.3.0FKOGO  
345-103832 Qa.S.21.0FKOGO  
345-114643 IX.S.13.0FKOGY

I need to print all lines where column 2 does not have an "O" or an "I" in either of its first two letters.

So, I would like something like:

awk '{ if( $2 != * O. || $2 != O *. || $2 != * I. || $2 != I *.) print $0}' ...

In such a way that the result should be:

345-103832 QX.S.3.0FKOGO  
345-103832 Qa.S.21.0FKOGO

Can you help me with that?


Solution

  • You may use

    awk '$2 !~ /^.?[OI]/' file
    


    The '$2 !~ /^.?[OI]/' condition means: print all lines where field 2 does not match the following pattern:

    • ^ - start of the string (here, the start of field 2)
    • .? - one optional character of any kind
    • [OI] - either O or I.

    If the first two characters must be letters, replace . with [[:alpha:]] or [A-Z], whichever fits your requirements best.
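
As a quick check, here is a minimal sketch that runs both forms of the pattern against the sample rows from the question. The variable names (sample, loose, strict) are only illustrative, and the stricter form uses [A-Z], which assumes the first character is an uppercase ASCII letter.

```shell
# Sample rows copied from the question; the variable names are illustrative only.
sample='345-103832 OI.S.15.0FKOGO
345-103832 OX.S.5.0FKOGO
345-103832 QX.S.3.0FKOGO
345-103832 Qa.S.21.0FKOGO
345-114643 IX.S.13.0FKOGY'

# Loose form: any single optional character, then O or I.
loose=$(printf '%s\n' "$sample" | awk '$2 !~ /^.?[OI]/')

# Stricter form: the optional first character must be an uppercase letter.
strict=$(printf '%s\n' "$sample" | awk '$2 !~ /^[A-Z]?[OI]/')

printf '%s\n' "$loose"
# 345-103832 QX.S.3.0FKOGO
# 345-103832 Qa.S.21.0FKOGO
```

On this sample both forms keep the same two lines. They would differ for a value such as 1O.S.1: the loose form drops it because . matches the digit, while the stricter form keeps it because the digit is not a letter.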