
I'm getting an error while trying to read a .fasta file in Python


I'm trying to read a .fasta file into a dictionary and extract the header and sequence separately. There are several headers and sequences in the file. An example is below.

header= CMP12
sequence=agcgtmmnngucnncttsckkld

But when I try to read a FASTA file using the function read_f and test it with print(dict.keys()), I get an empty list.

def read_f(fasta):
    '''Read a file from a FASTA format'''

    dictionary = {}
    with open(fasta) as file:
        text = file.readlines()
        print(text)

    name=''
    seq= ''
    #Create blocks of fasta text for each sequence, EXCEPT the last one
    for line in text:
        if line[0]=='>':
            dictionary[name] = seq
            name=line[1:].strip()
            seq=''

        else: seq = seq + line.strip()
    yield name,seq


fasta= ("sample.prot.fasta")
dict = read_f(fasta)

print(dict.keys())

This is the error I get:

'generator' object has no attribute 'keys'

Solution

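  • Because read_f contains the yield keyword, calling it does not execute the function body. Instead, Python returns a generator, and you have to iterate over that generator to get the values the function yields. A generator has no keys() method, which is why print(dict.keys()) raises this error.
    In concrete terms, there are two ways to fix it. Either store the last record with dictionary[name] = seq after the loop and end the function with return dictionary instead of yield name,seq, or yield a (name, seq) pair for every record and build the dictionary with dict(read_f(fasta)). As written, the function yields only once, after the loop, so wrapping the call in dict() alone would give you just the last record. Two smaller points: the first header line stores an empty entry under the key '', and naming the variable dict shadows the built-in type.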
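
    As a rough sketch of the generator approach, the reader could look something like the code below. The helper name iter_fasta is made up for illustration and is not part of the original code. The sketch also assumes that blank lines and any sequence lines appearing before the first header should simply be skipped.

```python
def iter_fasta(path):
    """Yield one (header, sequence) pair per record in a FASTA file."""
    name = None       # header of the record currently being read
    seq_parts = []    # sequence lines collected for that record
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header: emit the previous record, if there was one.
                if name is not None:
                    yield name, "".join(seq_parts)
                name = line[1:].strip()
                seq_parts = []
            elif name is not None:
                # Sequence lines before the first header are ignored.
                seq_parts.append(line)
    # The last record has no following header, so emit it after the loop.
    if name is not None:
        yield name, "".join(seq_parts)


def read_f(path):
    """Return a dict mapping each header to its full sequence."""
    return dict(iter_fasta(path))
```

    With this version, sequences = read_f("sample.prot.fasta") gives a regular dictionary, so print(sequences.keys()) works. If the file is large, you can loop over iter_fasta(...) directly instead of building the whole dictionary in memory.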