linux awk sed grep

Output text between two PATTERNS if text matches a CRITERIA


I want to select blocks between PATTERN1 and PATTERN2 if text inside the block contains CRITERIA, otherwise discard the whole block.

Sample task: Select text between PATTERN1='start' and PATTERN2='end', if some text between 'start' and 'end' matches CRITERIA='DCE', then output the whole block between 'start' and 'end'.

Sample input:

start
123
ABC
123
end
start
123
DCE
123
end
start
123
EFG
123
end

Sample output:

start
123
DCE
123
end

I've tried the following using awk, but couldn't find how to use CRITERIA between two patterns:

awk '/start/,/end/' input_file

Solution

  • EDIT: Per the OP, Input_file may have a matching block at the very end, and that block may lack the closing end string, so the code below handles that case too.

    awk '
    /start/{
      if(val)               {   print value   }
      flag=1
      value=val=""}
    /[dD][cC][eE]/ && flag  {   val=1         }
    flag{
      value=(value?value ORS $0:$0)
    }
    /end/                   {   flag=""       }
    END{
      if(val)               {   print value   }}
    '  Input_file
    

    Explanation:

    awk '
    /start/{                                     ##If the line contains the string start, do the following.
      if(val)               {   print value   }  ##If val is set (the previous block matched), print the buffered block held in value.
      flag=1                                     ##Set flag to 1 to mark that we are inside a block.
      value=val=""}                              ##Reset value and val for the new block.
    /[dD][cC][eE]/ && flag  {   val=1         }  ##If the line contains DCE (in any case) and flag is set, set val to 1.
    flag{                                        ##If flag is set, i.e. we are inside a block:
      value=(value?value ORS $0:$0)              ##Append the current line to value, separated by ORS (a newline by default).
    }
    /end/                   {   flag=""       }  ##If the line contains end, clear flag; this runs after the append so the end line is kept.
    END{                                         ##END block, run after the last line has been read.
      if(val)               {   print value   }} ##If val is still set, print the last buffered block.
    '  Input_file
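
To check this logic against the sample input from the question, a here-document can stand in for Input_file. This is only a minimal sketch: the wrapper name `select_blocks` is hypothetical, and it assumes the current line is buffered before the end check so that the closing end line stays in the output.

```shell
#!/bin/sh
# select_blocks: hypothetical wrapper name, not part of the original answer.
# Reads the named files, or stdin when no file is given.
select_blocks() {
  awk '
    /start/                { if (val) print value; flag = 1; value = val = "" }
    /[dD][cC][eE]/ && flag { val = 1 }                                  # block meets the criteria
    flag                   { value = (value == "" ? $0 : value ORS $0) } # buffer lines inside a block
    /end/                  { flag = "" }                                # close the block after buffering
    END                    { if (val) print value }                     # last block, with or without end
  ' "$@"
}

# Sample input from the question; only the DCE block should be printed.
select_blocks <<'EOF'
start
123
ABC
123
end
start
123
DCE
123
end
start
123
EFG
123
end
EOF
```

Run as a script, this should print the five lines of the sample output (start, 123, DCE, 123, end).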
    

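    Could you please try the following awk and let me know if this helps you. (This original version prints a matching block only when the next start line is read, so a matching block that comes last in the file is not printed; the EDIT version above covers that case.)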

    awk '/start/{if(val){print value};flag=1;value=val=""} /[dD][cC][eE]/ && flag{val=1} /end/{flag=""} {value=value?value ORS $0:$0}' Input_file
    

    Here is the same solution in non-one-liner form.

    awk '
    /start/{
      if(val)             { print value  }
      flag=1
      value=val=""}
    /[dD][cC][eE]/ && flag{  val=1       }
    /end/                 {  flag=""     }
    {
      value=value?value ORS $0:$0
    }
    '   Input_file
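
The same idea can be sketched with the three patterns passed in as awk variables, so PATTERN1, PATTERN2 and CRITERIA are not hard-coded. The function name `block_filter` and the variable names `s`, `e`, `c`, `buf`, `hit` and `inblk` are assumptions made for illustration. Unlike the answers above, this variant matches the criteria case-sensitively and prints a block as soon as its end line is seen.

```shell
#!/bin/sh
# block_filter START END CRITERIA [file...]
# Hypothetical helper: prints each START..END block that contains CRITERIA.
block_filter() {
  s=$1 e=$2 c=$3
  shift 3
  awk -v s="$s" -v e="$e" -v c="$c" '
    # A start line opens a new block; flush an unterminated matching block first.
    $0 ~ s { if (inblk && hit) print buf; buf = ""; hit = 0; inblk = 1 }
    # Inside a block: buffer the line and remember whether the criteria matched.
    inblk {
      buf = (buf == "" ? $0 : buf ORS $0)
      if ($0 ~ c) hit = 1
    }
    # An end line closes the block; print it only if the criteria matched.
    inblk && $0 ~ e { if (hit) print buf; inblk = 0; hit = 0 }
    # A matching block without an end line at end of input is still printed.
    END { if (inblk && hit) print buf }
  ' "$@"
}
```

For the sample task the call would be `block_filter start end DCE Input_file`. Note that `awk -v` interprets backslash escapes in the assigned value, so patterns containing backslashes need extra care.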