I annotated 500 sequences with Prokka, and I need to extract only the TcdA gene from each of them, using the .ffn annotation file that Prokka produced for every sequence.
How can I do this automatically, without having to open the folder of each annotated sequence?
Prokka files:
Strain1
>Strain1.err
>Strain1.faa
>Strain1.fna
>Strain1.ffn *I use this file to extract the gene*
I need the TcdA gene from all 500 sequences. An example record:
>Strain1_01428 glycosylating toxin TcdA
ATGTCTTTAATATCTAAAGAAGAGTTAATAAAACTCGCATATAGCATTAGACCAAGAGAA
AATGAGTATAAAACTATATTAACTAATTTAGACGAATATAATAAGTTAACTACAAACAAT
AATGAAAATAAATATTTACAATTAAAAAAACTAAATGAATCAATTGATGTTTTTATGAAT
AAATATAAAAATTCAAGCAGAAATAGAGCACTCTCTAATCTAAAAAAAGATATATTAAAA
GAAGTAATTCTTATTAAAAATTCCAATACAAGTCCTGTAGAAAAAAATTTACATTTTGTA
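Something like this (the output is redirected once, after the loop, so that each file does not overwrite the previous result):
for i in /path/to/*/*.ffn; do awk 'BEGIN {RS=">"} /glycosylating toxin TcdA/ {printf ">%s", $0}' "$i"; done > TcdA.fasta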
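A slightly fuller sketch of the same idea, in case it helps. It assumes one Prokka folder per strain under a common parent directory, and that the product text appears in the FASTA header line. The function name `extract_gene` and its three arguments are made up for illustration. It matches only against the header line, not the sequence, and reports on stderr any .ffn file with no matching record.

```shell
#!/usr/bin/env bash
# Sketch only: extract_gene and its arguments are hypothetical names.
# Assumed layout: <parent_dir>/<strain>/<strain>.ffn (one folder per strain).

extract_gene() {
    local parent_dir=$1   # directory holding one Prokka folder per strain
    local product=$2      # text to look for in the FASTA header line
    local out=$3          # combined output FASTA file

    : > "$out"            # start from an empty output file
    local ffn
    for ffn in "$parent_dir"/*/*.ffn; do
        [ -e "$ffn" ] || continue   # the glob matched nothing
        # RS=">" turns each FASTA record into one awk record;
        # FS="\n" makes $1 the header line of that record.
        awk -v pat="$product" '
            BEGIN { RS = ">"; FS = "\n" }
            NR > 1 && index($1, pat) {
                rec = $0
                sub(/\n+$/, "", rec)   # drop trailing newline(s) of the record
                print ">" rec
                found = 1
            }
            END { exit !found }        # non-zero exit status when nothing matched
        ' "$ffn" >> "$out" || echo "no match in $ffn" >&2
    done
}
```

It could then be called as `extract_gene /path/to "glycosylating toxin TcdA" TcdA.fasta`. Because the record ID (e.g. Strain1_01428) already carries the strain prefix, the records in the combined file stay traceable to their source.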