awk, if else conditional when record contains a value


I'm having trouble getting an awk if/else conditional to trigger properly when the record contains a value. I'm running this in zsh on macOS Catalina.

This script (the issue is on the second-to-last line)...

echo "abcdefgh" >  ./temp
echo "abc\"\(\"h" >> ./temp
echo "abcdefgh" >> ./temp
echo "abcde\(h" >> ./temp 

val='"\("'
key="NEW_NEW"
file="./temp"

echo $val
echo $key
echo $file

echo ""
echo "###############"
echo ""

awk '
    BEGIN { old=ARGV[1]; new=ARGV[2]; ARGV[1]=ARGV[2]=""; len=length(old) }
    ($0 ~ /old/){ s=index($0,old); print substr($0,1,s-1) new substr($0,s+len) }{ print $0 }
' $val $key $file

outputs:

"\("
NEW_NEW
./temp

###############

abcdefgh
abc"\("h
abcdefgh
abcde\(h

I want to fix the script so that it changes the "\(" to NEW_NEW but leaves the parenthesis without the quotes alone...

"\("
NEW_NEW
./temp

###############

abcdefgh
abcNEW_NEWh
abcdefgh
abcde\(h

EDIT

This is an abbreviated version of the real script that I'm working on. The answer will need to include the variable expansions that the sample above has, in order for me to use the command in the larger script. The ARGV format in use is preserving special characters, so the main question I have is why the conditional isn’t triggered as expected.


Solution

  • ($0 ~ /old/) means "do a regexp comparison between the current record ($0) and the literal regexp old", so it matches when $0 contains the 3 characters o, l, d in that order. You were probably trying to do a regexp comparison against the contents of the variable named old, which would be $0 ~ old (see How do I use shell variables in an awk script?), but you don't actually want that. You want a string comparison, which would be index($0,old) as shown in your previous question (https://stackoverflow.com/a/62096075/1745001). For some reason you have now moved that call out of the condition part of your condition { action } awk statement and made it the first part of the action instead. So don't do that.

    The other major problem with your script is that you're removing the quotes from around your shell variables, so they're being interpreted by the shell and undergoing word splitting, globbing (filename expansion), etc. before awk even gets to see them (see https://mywiki.wooledge.org/Quotes). So don't do that either.

    Fixing just the parts I mentioned:

    $ cat tst.sh
    echo "abcdefgh" >  ./temp
    echo "abc\"\(\"h" >> ./temp
    echo "abcdefgh" >> ./temp
    echo "abcde\(h" >> ./temp
    
    val='"\("'
    key="NEW_NEW"
    file="./temp"
    
    echo "$val"
    echo "$key"
    echo "$file"
    
    echo ""
    echo "###############"
    echo ""
    
    awk '
        BEGIN { old=ARGV[1]; new=ARGV[2]; ARGV[1]=ARGV[2]=""; len=length(old) }
        s=index($0,old) { $0 = substr($0,1,s-1) new substr($0,s+len) }
        { print }
    ' "$val" "$key" "$file"
    


    $ ./tst.sh
    "\("
    NEW_NEW
    ./temp
    
    ###############
    
    abcdefgh
    abcNEW_NEWh
    abcdefgh
    abcde\(h
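
To make the difference between the three kinds of comparison described above concrete, here is a minimal sketch that runs each one over the same records and reports which record numbers match. The fifth record (threshold) is a hypothetical addition, not part of the original data; it is there only because it contains the letters o, l, d. The temporary file from mktemp and the variable names literal, dynamic and string are likewise assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: the fifth record ("threshold") is a hypothetical addition,
# chosen because it contains the letters o, l, d in that order.
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT

printf '%s\n' 'abcdefgh' 'abc"\("h' 'abcdefgh' 'abcde\(h' 'threshold' > "$tmp"

val='"\("'

# 1. Regexp literal: /old/ is the three letters o, l, d, not the variable old.
literal=$(awk '
    BEGIN { old = ARGV[1]; ARGV[1] = "" }
    $0 ~ /old/ { print NR }
' "$val" "$tmp")

# 2. Dynamic regexp: the contents of old are used as a regexp, so \( stands
#    for a literal "(" and the backslash in the data is never matched.
dynamic=$(awk '
    BEGIN { old = ARGV[1]; ARGV[1] = "" }
    $0 ~ old { print NR }
' "$val" "$tmp")

# 3. String comparison: index() looks for the exact characters held in old.
string=$(awk '
    BEGIN { old = ARGV[1]; ARGV[1] = "" }
    index($0, old) { print NR }
' "$val" "$tmp")

echo "literal /old/ matched record(s): ${literal:-none}"
echo "dynamic \$0 ~ old matched record(s): ${dynamic:-none}"
echo "index(\$0, old) matched record(s): ${string:-none}"
```

With these records, the regexp literal should match only record 5 (threshold), the dynamic regexp should match nothing because it is looking for "(" without the backslash, and index() should match only record 2, which is the record the question wants to change.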