awk: identify column by condition, change value, and finally print all columns


I want to extract the value in each row of a file that comes after AA. I can do this like so:

awk -F'[;=|]' '{for(i=1;i<=NF;i++)if($i=="AA"){print toupper($(i+1));next}}'

This gives me the exact information I need and converts to uppercase, which is exactly what I want to do. How can I do this and then print the entire row with this altered value in its previous position? I am essentially trying to do a find and replace where the value is changed to uppercase.

EDIT:

Here is a sample input line:

11  128196  rs576393503 A   G   100 PASS    AC=453;AF=0.0904553;AN=5008;NS=2504;DP=5057;EAS_AF=0.0159;AMR_AF=0.0259;AFR_AF=0.3071;EUR_AF=0.006;SAS_AF=0.0072;AA=g|||;VT=SNP

and here is how I would like the output to look:

11  128196  rs576393503 A   G   100 PASS    AC=453;AF=0.0904553;AN=5008;NS=2504;DP=5057;EAS_AF=0.0159;AMR_AF=0.0259;AFR_AF=0.3071;EUR_AF=0.006;SAS_AF=0.0072;AA=G|||;VT=SNP

The only change is that the g after AA= is now uppercase.


Solution

  • The following awk command should do this:

    awk '
    # match() sets RSTART and RLENGTH; the guard skips the rewrite when there is no AA=.
    match($0,/AA=[^|]*/) {
      $0 = substr($0,1,RSTART+2) toupper(substr($0,RSTART+3,RLENGTH-3)) substr($0,RSTART+RLENGTH)
    }
    { print }
    '   Input_file
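
To make the RSTART/RLENGTH arithmetic easier to follow, here is a sketch that splits the line into three named pieces before gluing them back together. The function name upper_aa and the variable names head, value and tail are mine, not part of the answer. It assumes, as in the sample line, that the AA value is followed by a `|`.

```shell
# Hypothetical wrapper so the awk program can be reused on files or stdin.
upper_aa() {
  awk '
  # match() returns 0 when nothing matches, so this block only runs on lines with AA=.
  match($0, /AA=[^|]*/) {
    head  = substr($0, 1, RSTART + 2)            # everything up to and including "AA="
    value = substr($0, RSTART + 3, RLENGTH - 3)  # the text right after "AA="
    tail  = substr($0, RSTART + RLENGTH)         # the remainder of the line
    $0 = head toupper(value) tail
  }
  { print }  # lines without AA= are printed unchanged
  ' "$@"
}

printf 'SAS_AF=0.0072;AA=g|||;VT=SNP\n' | upper_aa
# prints: SAS_AF=0.0072;AA=G|||;VT=SNP
```

Because the line is rebuilt with substr() rather than by reassigning individual fields, the original separators (tabs, `;`, `=`, `|`) are left exactly as they were.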