Tags: python, sequence, biopython, dna-sequence

Calculate the mean length of amino acid sequences with Biopython


I have this code for calculating the length of sequences in FASTA format using Biopython. It gives me the lengths:

NP_418305.1
  349
NP_418306.1
  469
NP_418308.1
  236

However, now I'd like to calculate the mean length over all the sequences, just as an interesting fact that I can add to my research. It would be great to get some advice.

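from Bio import SeqIO

# Load every record from the FASTA file into a dict keyed by record ID
record_dict = SeqIO.to_dict(SeqIO.parse("aminoacids.txt", "fasta"))

# Print each record ID followed by the length of its sequence
for seq_id, record in record_dict.items():
    print(seq_id, "\n ", len(record.seq))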


Solution

  • I was able to get the mean length by summing the lengths of all the sequences and dividing by the number of sequences.
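
The answer does not show its code, so here is a minimal sketch of that sum-and-divide approach. The function names `mean_length` and `mean_length_from_fasta` are made up for illustration, and the file name `aminoacids.txt` is taken from the question. The last line only averages the three lengths shown in the question; the real file may contain more sequences.

```python
from statistics import mean


def mean_length(lengths):
    """Return the arithmetic mean of an iterable of sequence lengths."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("no sequences to average")
    # Same as sum(lengths) / len(lengths)
    return mean(lengths)


def mean_length_from_fasta(path):
    """Mean sequence length of a FASTA file, read with Biopython."""
    # Imported here so mean_length() can be used without Biopython installed
    from Bio import SeqIO

    return mean_length(len(record.seq) for record in SeqIO.parse(path, "fasta"))


# The three lengths shown in the question (roughly 351.3)
print(mean_length([349, 469, 236]))
```

For the file in the question, calling `mean_length_from_fasta("aminoacids.txt")` should give the mean over every record. Iterating over `SeqIO.parse` directly avoids building the dictionary, which is not needed when only the lengths matter.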