
Read columns from a file into variables and use them to substitute values in another file


I have the following file, input.txt:

b73_chr10   w22_chr9
w22_chr7    w22_chr10
w22_chr8    w22_chr8

I have written the following code (given below) to read the first and second columns and replace each first-column value with the corresponding second-column value in the output.conf file. For example, I would like to replace b73_chr10 with w22_chr9, w22_chr7 with w22_chr10, and w22_chr8 with w22_chr8, and so on for every row until the end of the file.

value1=$(echo $line| awk -F\ '{print $1}' input.txt)
value2=$(echo $line| awk -F\ '{print $2}' input.txt)
sed -i '.bak' 's/$value1/$value2/g' output.conf 
cat output.conf

output.conf

    <rules>
    <rule>
    condition =between(b73_chr10,w22_chr1)
    color = ylgn-9-seq-7
    flow=continue
    z=9
    </rule>
    <rule>
    condition =between(w22_chr7,w22_chr2)
    color = blue
    flow=continue
    z=10
    </rule>
    <rule>
    condition =between(w22_chr8,w22_chr3)
    color = vvdblue
    flow=continue
    z=11
    </rule>
    </rules>

I tried the commands above, but they leave me with a blank file. Can anybody point out where I went wrong?


Solution

  • I suspect that sed by itself is the wrong tool for this. You can, however, do what you're asking in bash alone:

    #!/usr/bin/env bash
    
    # Declare an associative array (requires bash 4 or later)
    declare -A repl=()
    
    # Step through the replacement file, recording each pair in the array.
    while read -r this that; do
      repl["$this"]="$that"
    done < input.txt
    
    # Read the config file, replacing the strings noted in the array.
    # IFS= and -r preserve leading whitespace and backslashes.
    while IFS= read -r line; do
      for string in "${!repl[@]}"; do
        # Quoting "$string" makes the match literal rather than a glob pattern.
        line=${line/"$string"/${repl[$string]}}
      done
      printf '%s\n' "$line"
    done < output.conf
    

    This approach is of course oversimplified and shouldn't be used verbatim: you'll want to make sure you're only editing the lines that you really want to edit (verifying that they match /condition =between/, for example). Note that because this solution uses an associative array (declare -A ...), it requires bash version 4 or later.
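The caveat above can be sketched as a function that only rewrites lines containing `condition =between(`. This is a minimal sketch, not the answer's original code: the function name `apply_map` is hypothetical, it assumes bash 4 or later, and it assumes the map file holds one whitespace-separated `old new` pair per line. Unlike the loop above, it replaces every occurrence on a matching line and treats each key as a literal string.

```shell
#!/usr/bin/env bash
# Sketch only: apply_map is a hypothetical helper name, not from the answer.
# Usage: apply_map MAP_FILE CONF_FILE   (requires bash 4+ for associative arrays)
apply_map() {
  local map_file=$1 conf_file=$2 old new line key
  local -A repl=()

  # Load "old new" pairs, one per line; skip blank lines.
  while read -r old new || [[ -n $old ]]; do
    if [[ -n $old ]]; then
      repl["$old"]=$new
    fi
  done < "$map_file"

  # IFS= and -r keep leading indentation and backslashes intact.
  while IFS= read -r line || [[ -n $line ]]; do
    # Only touch lines that carry a between(...) condition.
    if [[ $line == *"condition =between("* ]]; then
      for key in "${!repl[@]}"; do
        # Quoting "$key" makes the match literal rather than a glob pattern.
        line=${line//"$key"/${repl[$key]}}
      done
    fi
    printf '%s\n' "$line"
  done < "$conf_file"
}
```

It could be called as `apply_map input.txt output.conf > output.conf.new`. Keys are still matched as substrings, so a key such as w22_chr1 would also match inside w22_chr10; if that matters, the match needs delimiters around it.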

    If you were to solve this with awk, the same basic principle would apply:

    #!/usr/bin/awk -f
    
    # Collect the translations from the first file.
    NR==FNR { repl[$1]=$2; next }
    
    # Step through the input file, replacing as required.
    {
      for ( string in repl ) {
        sub(string, repl[string])
      }
    }
    
    # And print.
    1
    

    You'd run this with the first argument being the translation file (input.txt in the question) and the second being the file to edit (output.conf in the question):

    $ ./thisscript translations.txt circos.conf
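Both scripts write the edited text to standard output and leave the original file untouched. A minimal sketch of saving the result back with a backup copy, roughly what `sed -i '.bak'` was meant to do in the question, might look like the following. The function name `update_conf` is hypothetical, and the awk program is inlined so the block runs on its own.

```shell
#!/usr/bin/env bash
# Sketch only: update_conf is a hypothetical helper name.
# Usage: update_conf MAP_FILE CONF_FILE
update_conf() {
  local map_file=$1 conf_file=$2 tmp
  tmp=$(mktemp) || return 1

  # Same awk logic as the script above, inlined so this block is self-contained.
  if awk '
       NR == FNR { repl[$1] = $2; next }    # first file: collect translations
       { for (s in repl) sub(s, repl[s]) }  # second file: substitute
       1                                    # print every line
     ' "$map_file" "$conf_file" > "$tmp"
  then
    # Keep a backup of the original, then overwrite it in place.
    # Writing with cat (rather than mv) keeps the file's existing permissions.
    cp -- "$conf_file" "$conf_file.bak" &&
      cat -- "$tmp" > "$conf_file" &&
      rm -f -- "$tmp"
  else
    rm -f -- "$tmp"
    return 1
  fi
}
```

Calling `update_conf input.txt output.conf` rewrites output.conf and leaves the previous contents in output.conf.bak.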