I'm trying to read a .fasta file into a dictionary and extract the headers and sequences separately. There are several headers and sequences in the file; an example is below.
header= CMP12
sequence=agcgtmmnngucnncttsckkld
But when I try to read a FASTA file using the function read_f and test it using print(dict.keys()), I get an empty list.
def read_f(fasta):
    '''Read a file from a FASTA format'''
    dictionary = {}
    with open(fasta) as file:
        text = file.readlines()
        print(text)
    name = ''
    seq = ''
    # Create blocks of fasta text for each sequence, EXCEPT the last one
    for line in text:
        if line[0] == '>':
            dictionary[name] = seq
            name = line[1:].strip()
            seq = ''
        else:
            seq = seq + line.strip()
    yield name, seq

fasta = "sample.prot.fasta"
dict = read_f(fasta)
print(dict.keys())
This is the error I get:
'generator' object has no attribute 'keys'
Using the yield keyword means that calling read_f does not execute the function body. Instead, the call returns a generator, and you have to iterate over that generator to get the elements the function yields.
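To see this behaviour in isolation, here is a minimal sketch. The function count_up is a made-up example, not part of the question's code; it only illustrates that the body of a generator function does not run until something iterates over it, and that the returned object has no dictionary methods.

```python
def count_up(limit):
    """Hypothetical generator function used only to illustrate laziness."""
    print("body started")  # runs only once iteration begins, not at call time
    for i in range(limit):
        yield i


gen = count_up(3)                # returns a generator; nothing is printed yet
has_keys = hasattr(gen, "keys")  # generators have no .keys() method
values = list(gen)               # iterating runs the body and collects the items
```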
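In concrete terms, consume the generator before asking for keys: replace dict = read_f(fasta) with records = dict(read_f(fasta)) and then call records.keys(). The built-in dict() accepts an iterable of (key, value) pairs, which is what the generator yields. Use a name other than dict for your own variable, because assigning to dict shadows the built-in. Note that this only gives the full dictionary if the function yields one (name, seq) pair per record, including the last one.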
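As a rough sketch of how the pieces could fit together, the generator below yields one (header, sequence) pair per record and flushes the last record after the loop ends. The name read_fasta_records is an assumption made for illustration, and the sketch assumes a plain-text FASTA file in which every record starts with a line beginning with ">"; any lines before the first header are ignored.

```python
def read_fasta_records(path):
    """Yield (header, sequence) pairs from a FASTA file, one pair per record.

    Hypothetical helper: assumes each record starts with a '>' header line.
    """
    name = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if name is not None:
                    yield name, "".join(chunks)
                name = line[1:].strip()
                chunks = []
            elif name is not None:
                chunks.append(line)
    # The last record has no following header, so emit it here.
    if name is not None:
        yield name, "".join(chunks)


# Usage (assuming the file from the question exists):
# records = dict(read_fasta_records("sample.prot.fasta"))
# print(records.keys())
```

Wrapping the call in dict() is what turns the yielded pairs into a dictionary; if you only need to loop over the records once, you can iterate over the generator directly and skip the dictionary.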