How to extract a pattern but fill missing values in bash?

I have a large tab-delimited file (dummy.vcf) in which one column holds ';'-delimited variables. For example:

AF_female=0.00000e+00;non_topmed_AF_female=0.00000e+00;control_AF_female=0.00000e+00
control_AF_female=0.00000e+00;non_topmed_AF_female=0.00000e+00
AF_female=0.00008e+00;non_topmed_AF_female=0.00000e+00

I would like to extract the "AF_female=X" string for each row with missing values filled in, so the new file is the same length as the original. For example:

AF_female=0.00000e+00
NA
AF_female=0.00008e+00

I have tried:

grep -o ';AF_female=[0-9].[0-9]*..[0-9]*' dummy.vcf

However, this does not output a row when the pattern is not matched.

Any help will be very much appreciated!

Solution

Could you please try the following, if you are OK with awk?

awk -F';' '
{
  val=""
  for(i=1;i<=NF;i++){
     if($i ~ /^AF_female=[0-9]+/){
         val=(val?val OFS $i:$i)
     }
  }
  if(val!=""){
     print val
  }
  else{
     print "NA"
  }
}'  Input_file

This checks every field on a line for AF_female= followed by digits and prints each match (joined by OFS if there is more than one). If a line has no match, it prints NA instead, so the output has one line per input line.

Output will be as follows.

AF_female=0.00000e+00
NA
AF_female=0.00008e+00
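
One caveat, given that the real file is tab-delimited: with -F';' alone, the first item of the ';'-delimited column stays glued to the columns before it, so a field such as "id1<TAB>AF_female=..." would not match ^AF_female=. A minimal sketch of one way around this, assuming it is acceptable to split on tabs as well as semicolons (the ID column and the sample rows below are made up for illustration, not taken from dummy.vcf):

```shell
# Hypothetical rows: an ID column, a tab, then the ';'-delimited column.
input=$(printf 'id1\tAF_female=0.00000e+00;non_topmed_AF_female=0.00000e+00\nid2\tcontrol_AF_female=0.00000e+00\nid3\tnon_topmed_AF_female=0.00000e+00;AF_female=0.00008e+00\n')

# Split on tab OR semicolon so every item becomes its own field.
result=$(printf '%s\n' "$input" | awk -F'[\t;]' '
{
  val = ""
  for (i = 1; i <= NF; i++) {
    if ($i ~ /^AF_female=[0-9]+/) {
      val = (val == "" ? $i : val OFS $i)
    }
  }
  # Print the match, or NA when no field matched on this line.
  print (val == "" ? "NA" : val)
}')

printf '%s\n' "$result"
```

For the question's one-column sample the original -F';' command is enough; this variant only matters when other columns precede the ';'-delimited one.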

Explanation: the same command, annotated line by line.

awk -F';' '                           ##Start the awk program with the semicolon as the field separator.
{
  val=""                              ##Reset val to the empty string for each line.
  for(i=1;i<=NF;i++){                 ##Loop over every field; NF is the number of fields in the current line.
     if($i ~ /^AF_female=[0-9]+/){    ##If the field starts with AF_female= followed by at least one digit:
         val=(val?val OFS $i:$i)      ##set val to the field, or append the field (separated by OFS) if val already holds a match.
     }
  }
  if(val!=""){                        ##After the loop, if val is not empty:
     print val                        ##print the collected match(es).
  }
  else{                               ##Otherwise (no field matched):
     print "NA"                       ##print NA so the output keeps one line per input line.
  }
}' Input_file                         ##Input_file is the name of the file to read.
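
Since the goal is an output with the same number of lines as the input, a quick sanity check may be worthwhile. The sketch below runs the same logic on the three sample rows from the question and compares line counts; the variable names (input, output, in_lines, out_lines) are only illustrative.

```shell
# The three sample rows from the question, held in a variable for the demo.
input=$(printf '%s\n' \
  'AF_female=0.00000e+00;non_topmed_AF_female=0.00000e+00;control_AF_female=0.00000e+00' \
  'control_AF_female=0.00000e+00;non_topmed_AF_female=0.00000e+00' \
  'AF_female=0.00008e+00;non_topmed_AF_female=0.00000e+00')

# Same extraction as above: the matching field(s), or NA when nothing matches.
output=$(printf '%s\n' "$input" | awk -F';' '
{
  val = ""
  for (i = 1; i <= NF; i++) {
    if ($i ~ /^AF_female=[0-9]+/) {
      val = (val == "" ? $i : val OFS $i)
    }
  }
  print (val == "" ? "NA" : val)
}')

# Arithmetic expansion strips any padding that wc may add around the count.
in_lines=$(( $(printf '%s\n' "$input" | wc -l) ))
out_lines=$(( $(printf '%s\n' "$output" | wc -l) ))

if [ "$in_lines" -eq "$out_lines" ]; then
  echo "line counts match: $in_lines"
else
  echo "line counts differ: $in_lines vs $out_lines" >&2
fi
```

On a real file the same comparison can be made with wc -l on the input file and on the redirected output.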