bash awk xmllint

How can I escape special characters that are passed to a function containing an awk command?


To keep this problem simple to demonstrate, I made a fake XML file like this:

<abc>
      <spirit:addressBlock>
        <spirit:name>cmn700_registers</spirit:name>
          <def>
          </def>
      </spirit:addressBlock>
</abc>

I want to print the lines containing the pattern <spirit:name> inside a block of lines, where the block begins with the pattern <spirit:addressBlock> and ends with </spirit:addressBlock>. I defined a function in .bash_aliases like this:

function SearchPatInBlk {
awk "/$1/{inblk=1} inblk==1&&/$2/{inblk=0} inblk==1&&/$3/{print \$0}" $4
}

The first and second arguments are the block start and end patterns, the third argument is the pattern whose matching lines I want to print, and the fourth argument is the XML filename. I then ran this command in the bash shell:

SearchPatInBlk <spirit:addressBlock> </spirit:addressBlock> <spirit:name> ../../ab21/ab21_cmn700_new10_clst/build/ab21_cmn700/logical/cmn700/ipxact/cmn700_ab21.xml

Of course this gives me an error.

bash: syntax error near unexpected token `<'

So I tried putting escape characters (\) before <, > and /, but it doesn't work. How should I do it?


Solution

  • Using a true XML parser would be better than a general-purpose text processor like awk. But if you absolutely need awk, there are several things to fix.

    • Quote your pattern strings, so that the shell does not treat < and > as redirections.
    • Escape regex operators in your pattern strings.
    • Pass your pattern strings to awk as awk variables (-v), not as parts of the awk script.
    • Use the awk range pattern (regex,regex) to select the block.

    Optionally, you could also use more precise regexes (anchored, with optional surrounding whitespace) and, if your awk is GNU awk, mark the patterns as strongly typed regex constants (@/.../):

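    function SearchPatInBlk {
      # $1 = block start regex, $2 = block end regex,
      # $3 = regex of the lines to print, $4 = input file.
      # Each range endpoint must be an explicit match against $0:
      # a bare variable would only be tested for being non-empty.
      awk -v v1="$1" -v v2="$2" -v v3="$3" '$0 ~ v1, $0 ~ v2 {if($0 ~ v3) print}' "$4"
    }

    SearchPatInBlk '@/^[[:space:]]*[<]spirit:addressBlock[>][[:space:]]*$/' \
      '@/^[[:space:]]*[<][/]spirit:addressBlock[>][[:space:]]*$/' \
      '@/[<]spirit:name[>]/' file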
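
If your awk is not GNU awk, the same idea can probably be sketched with plain strings used as dynamic regexes. The following is a minimal, self-contained sketch, not a definitive implementation: the function name search_pat_in_blk, the awk variable names and the temporary file are assumptions made for illustration, and the sample data is the fake XML from the question.

```shell
# Hypothetical helper (name is an assumption), meant to work with any POSIX awk.
# $1 = block start regex, $2 = block end regex,
# $3 = regex of the lines to print, $4 = input file.
search_pat_in_blk() {
  awk -v blk_start="$1" -v blk_end="$2" -v pat="$3" '
    # Range pattern: active from a line matching blk_start
    # up to and including the next line matching blk_end.
    $0 ~ blk_start, $0 ~ blk_end { if ($0 ~ pat) print }
  ' "$4"
}

# Recreate the fake XML file from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<abc>
      <spirit:addressBlock>
        <spirit:name>cmn700_registers</spirit:name>
          <def>
          </def>
      </spirit:addressBlock>
</abc>
EOF

# Single quotes keep the shell from parsing < and > as redirections.
# Inside a dynamic regex, <, > and / are ordinary characters.
result=$(search_pat_in_blk '<spirit:addressBlock>' '</spirit:addressBlock>' \
  '<spirit:name>' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Note that awk processes backslash escape sequences in -v assignments, so a pattern such as \. would have to be written \\. on the command line. Bracket expressions such as [.] or [<] avoid this, which is one reason to prefer them when passing regexes as variables.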