Why does the 'join' method for Seq object in Biopython not work on the last element of a list?


The code below is from the Biopython tutorial. I intend to add a spacer of ten Ns (N10) after every contig. Why is the trailing N10 not present after the third contig "TTGCA"?

from Bio.Seq import Seq 
contigs = [Seq("ATG"), Seq("ATCCCG"), Seq("TTGCA")] 
spacer = Seq("N"*10) 
spacer.join(contigs) 

Output:
Seq('ATGNNNNNNNNNNATCCCGNNNNNNNNNNTTGCA')

Expected output:
Seq('ATGNNNNNNNNNNATCCCGNNNNNNNNNNTTGCANNNNNNNNNN')

Don't indices in both Python and Biopython begin at 0?

Thank you


Solution

  • This has nothing to do with Biopython.

    It is simply how Python's str.join works, and Seq.join behaves the same way:

    contigs = ["ATG", "ATCCCG", "TTGCA"]
    spacer = "N" * 10
    spacer.join(contigs)
    

    Result: ATGNNNNNNNNNNATCCCGNNNNNNNNNNTTGCA

    As it should - according to help(str.join):

    join(self, iterable, /)
        Concatenate any number of strings.

        The string whose method is called is inserted in between each given string.
        The result is returned as a new string.

        Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'
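
    If a trailing spacer is actually wanted, one possible approach is to add it explicitly. The sketch below uses plain strings so that it runs without Biopython installed; the same expressions should also work with Seq objects, since Seq supports + concatenation. The names joined, with_trailing, and per_contig are illustrative and not part of the original question.

```python
contigs = ["ATG", "ATCCCG", "TTGCA"]
spacer = "N" * 10

# join() places the spacer only BETWEEN items: 3 contigs -> 2 spacers
joined = spacer.join(contigs)

# Option 1: append one more spacer after joining
with_trailing = spacer.join(contigs) + spacer

# Option 2: attach the spacer to every contig, then join with an empty separator
per_contig = "".join(contig + spacer for contig in contigs)

print(joined)         # ATGNNNNNNNNNNATCCCGNNNNNNNNNNTTGCA
print(with_trailing)  # ATGNNNNNNNNNNATCCCGNNNNNNNNNNTTGCANNNNNNNNNN
```

    Both options give the same string. Option 1 is the smaller change to the original code, while Option 2 makes the "one spacer per contig" intent explicit.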