I am trying to change the FASTQ headers by appending the postfix /1 or /2, and then write the records back to a new file. However, I get this error:
No suitable quality scores found in letter_annotations of SeqRecord
Is there any way to solve this problem? Do I need to modify the quality score information to match the changed FASTQ header?
import sys
from Bio.Seq import Seq
from Bio import SeqIO
from Bio.SeqRecord import SeqRecord

file = sys.argv[1]
final_records = []
for seq_record in SeqIO.parse(file, "fastq"):
    print(seq_record.format("fastq"))
    # read header
    header = seq_record.id
    # add /1 at the end
    header = "{0}/1".format(header)
    # print(repr(seq_record.seq))
    record = SeqRecord(seq_record.seq, id=header, description=seq_record.description)
    final_records.append(record)
SeqIO.write(final_records, "my_example.fastq", "fastq")
You're getting the error because the new SeqRecord objects you create don't carry any quality scores, and the FASTQ writer needs them. You could copy the quality scores over from the input records:
record.letter_annotations["phred_quality"]=seq_record.letter_annotations["phred_quality"]
It's probably easier to just modify the IDs of the original records and write those to the output file, though:
seq_record.id = header
final_records.append(seq_record)
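Putting the second approach together, here is a minimal sketch rather than a definitive script. The helper name `add_suffix`, the `main` wrapper and the hard-coded output name are my own choices for illustration. One assumption worth checking on your data: as far as I know, the FASTQ parser keeps the whole title line in `description`, and the writer puts `id` in front of `description` when the two don't agree, so changing only `id` can leave the old ID in the header after the new one. The sketch therefore updates `description` as well.

```python
import sys


def add_suffix(records, suffix):
    """Yield each record with `suffix` appended to its ID (hypothetical helper)."""
    for record in records:
        old_id = record.id
        new_id = old_id + suffix
        # Keep the description in step with the new ID so the old ID is not
        # repeated in the output header (assumption about the FASTQ writer).
        if record.description.startswith(old_id):
            record.description = new_id + record.description[len(old_id):]
        record.id = new_id
        record.name = new_id
        # The record is modified in place, so its quality scores
        # (letter_annotations["phred_quality"]) are left untouched.
        yield record


def main(in_path, out_path, suffix="/1"):
    # Imported here so add_suffix() can be reused without Biopython installed.
    from Bio import SeqIO

    records = SeqIO.parse(in_path, "fastq")
    # SeqIO.write accepts any iterable of records and returns how many it wrote.
    count = SeqIO.write(add_suffix(records, suffix), out_path, "fastq")
    print("Wrote {0} records to {1}".format(count, out_path))


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1], "my_example.fastq")
```

Because the records are streamed through a generator instead of being collected in a list, this should also cope with large FASTQ files without holding everything in memory. For the second file of a pair you would call `main` with `suffix="/2"`.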