Apologies if this has been asked before, but I could not find an answer to my question.
Assume the following file, created like this:
touch test2
echo "refH.fasta = ref/GCA_013924565.1_ASM1392456v1_genomic.fasta" >> test2
echo "subjectGCA_017717955.1_PDT000990484.1_genomic_querry.fasta = GCA_017717955.1_PDT000990484.1_genomic.fasta" >> test2
echo "file=subjectGCA_017717955.1_PDT000990484.1_genomic_querry" >> test2
In the above file I would like to remove the dots ONLY between the strings 'subject' and '_querry', and leave the rest of the file untouched.
Therefore the output should look like this:
refH.fasta = ref/GCA_013924565.1_ASM1392456v1_genomic.fasta
subjectGCA_0177179551_PDT0009904841_genomic_querry.fasta = GCA_017717955.1_PDT000990484.1_genomic.fasta
file=subjectGCA_0177179551_PDT0009904841_genomic_querry
Thanks
Here is a Ruby one-liner to do that (note the spelling '_querry', as in the file):
ruby -lpe '$_=$_.split(/(subject.*?_querry)/).
map{|s| s[/subject.*?_querry/] ? s.gsub(/\./,"") : s}.join' test2
Or Perl:
perl -lnE '@a=(); for $x (split /(subject.*?_querry)/){
    $x=~s/\.//g if $x=~/subject.*?_querry/;   # strip dots only inside a matched span
    push @a,$x }
say join("",@a)' test2
It is possible entirely in Bash:
while IFS= read -r line || [[ -n $line ]]; do
    if [[ $line =~ (subject.*_querry) ]]; then
        # replace the matched span with a copy of itself that has the dots removed
        line=${line/"${BASH_REMATCH[1]}"/"${BASH_REMATCH[1]//./}"}
    fi
    printf "%s\n" "$line"
done <test2
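But this only handles one match per line: Bash regexes have no lazy quantifier, so with two spans on a line the greedy .* runs from the first 'subject' to the last '_querry' and strips the dots in between as well.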
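If several spans per line need handling in pure Bash, one option is to drop the regex and walk the line with parameter expansion instead. The following is only a sketch under that assumption; the function name strip_dots is made up for illustration, and it has been reasoned through rather than tested on large inputs:

```bash
#!/usr/bin/env bash
# strip_dots (hypothetical helper): remove the dots inside every
# "subject..._querry" span of a single line and print the result.
strip_dots() {
    local rest=$1 out="" head span
    while [[ $rest == *subject*_querry* ]]; do
        head=${rest%%subject*}          # text before the first "subject"
        rest=${rest#"$head"}            # rest now begins with "subject"
        span=${rest%%_querry*}_querry   # shortest span, up to the first "_querry"
        rest=${rest#"$span"}            # text that follows the span
        out+=$head${span//./}           # head copied verbatim, span without dots
    done
    printf '%s\n' "$out$rest"
}

# Two spans on one line; the dots in "x.y" between them are kept.
strip_dots 'subjectA.1_querry = x.y subjectB.2_querry'
```

To run it over the file, call it from the same read loop as above: while IFS= read -r line || [[ -n $line ]]; do strip_dots "$line"; done <test2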