I have been trying to extract information from a GenBank file and print just the locus tag and translation of each CDS, using the code below by xbello, which I modified.
from Bio import SeqIO
for rec in SeqIO.parse("file.gb", "genbank"):
    if rec.features:
        for feature in rec.features:
            if feature.type == "CDS" and feature.qualifiers.has_key('translation'):
                print '>'+feature.qualifiers['locus_tag'][0]
                print feature.qualifiers['translation'][0]
This works, but it prints each translation sequence as a single very long line (I assume up to the maximum length Python allows). I was wondering whether the sequences could instead be formatted as multi-line blocks of about 60 characters per line, which is what you often see in .faa files, for example.
I have tried print(textwrap.fill(feature.qualifiers['translation'], width=60)) and print(textwrap.wrap(feature.qualifiers['translation'], width=60)), but so far neither has worked. I have also tried assigning
X = feature.qualifiers['translation']
and then calling print(textwrap.fill(X, width=60)), but unsurprisingly the computer had no idea what I was asking it to do…
I am not sure which other formatting commands work with print rather than Xout.write. I have a strong feeling that I have not written this in a way that tells the computer to take the text from print feature.qualifiers['translation'] and then wrap it with width=60.
I run this code as a script from cmd or PowerShell, using ">X.xx" to set the output file name and file type.
You could write a custom print function that takes a string, splits it into chunks of 60 characters, and prints each chunk on its own line.
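def custom_print(string):
    counter = 0
    res = ""
    for char in string:
        if counter == 60:
            # The current line is full: print it and start a new one.
            print(res)
            counter = 0
            res = ""
        res += char
        counter += 1
    # Print whatever is left over (the last, possibly shorter, line).
    if res:
        print(res)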
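
As an alternative to a hand-written loop, textwrap.fill can do the wrapping, provided it receives a single string. A likely reason the attempts in the question failed is that feature.qualifiers['translation'] is a list (the working code indexes it with [0]), whereas textwrap.fill expects a string. The following is a minimal sketch under that assumption; format_fasta_record and the sample qualifiers dict are hypothetical names used only for illustration and stand in for a real feature.qualifiers.

```python
import textwrap


def format_fasta_record(locus_tag, translation, width=60):
    """Return a FASTA-style record with the sequence wrapped at `width` characters."""
    # textwrap.fill needs one string; long unbroken words are split at `width`.
    return ">" + locus_tag + "\n" + textwrap.fill(translation, width=width)


# Hypothetical stand-in for feature.qualifiers (values are lists of strings,
# as the [0] indexing in the question's working code suggests).
qualifiers = {"locus_tag": ["ABC_0001"], "translation": ["M" + "A" * 129]}

record = format_fasta_record(qualifiers["locus_tag"][0],
                             qualifiers["translation"][0])
print(record)
```

In the original loop this would amount to passing feature.qualifiers['translation'][0], rather than the list itself, to textwrap.fill. Redirecting the script's output with ">X.xx" should then work as before.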