python, list, bioinformatics, cycle

Generating a Cyclic Spectrum


I'm trying to generate a spectrum of all the cyclic combinations from my peptide.

Here's my code for the linear spectrum:

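    # For example: LEQN
    # L:113 E:129 Q:128 N:114
    peptide = [113, 129, 128, 114]
    spectrum = []
    s = 0  # running sum of the current sub-peptide
    b = 0  # index where the current sub-peptide starts
    for a in peptide:
        for i in peptide[b:]:
            s += i
            spectrum.append(s)
        s = 0
        b += 1

    spectrum.sort()
    print(spectrum)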

Outputs: [113, 114, 128, 129, 242, 242, 257, 370, 371, 484]

My code successfully adds these sums: L(113), E(129), Q(128), N(114), LE(113+129), LEQ(113+129+128), LEQN(113+129+128+114), EQ(129+128), EQN(129+128+114), QN(128+114)

BUT it is missing QNL(128+114+113), NL(114+113), and NLE(114+113+129)

Ex. QNL should be 128+114+113, which is the sum of elements 2, 3, and 0. NL is 114+113, which is the sum of elements 3 and 0. And NLE is 114+113+129, which is the sum of elements 3, 0, and 1.

*I wouldn't need to add EQNL or QNLE because they are the exact same thing as LEQN.

*However LE=242 and QN=242 have the same mass but are NOT the same thing.

Expected Output: 113, 114, 128, 129, 227(N+L), 242, 242, 257, 355(Q+N+L), 356(N+L+E), 370, 371, 484


Solution

  • If I understand your question correctly, you want every contiguous sublist, for each length up to the length of the peptide list and for each starting position in that list, wrapping around at the end of the list. One way to do this would be using cycle and islice from itertools.

    from itertools import cycle, islice
    
    peptide = [113, 129, 128, 114]
    spectrum = []
    for num in range(1, len(peptide)):
        for start in range(len(peptide)):
            group = islice(cycle(peptide), start, start + num)
            spectrum.append(sum(group))
    spectrum.append(sum(peptide)) # add the entire list only once
    

    This way, sorted(spectrum) ends up as [113, 114, 128, 129, 227, 242, 242, 257, 355, 356, 370, 371, 484], which seems to be what you wanted.

    I'm not sure, though, how well this scales to much longer peptides (I'd assume that in practice the lists have more than four elements).
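On the scaling question: islice has to step over the first start items of the cycle before it yields anything, and each sum then re-adds up to len(peptide) - 1 masses, so the work grows roughly with the cube of the peptide length. If that ever becomes a problem, one possible alternative (a sketch, not something from the question) is to precompute prefix sums over the peptide written twice, so that each wrapped sublist sum is a single subtraction. The function name cyclic_spectrum and the variable prefix are made up for this example.

```python
def cyclic_spectrum(peptide):
    """Return the sorted masses of all cyclic sub-peptides (hypothetical helper)."""
    n = len(peptide)
    # prefix[i] is the sum of the first i masses of the peptide written twice,
    # so a sublist that wraps around the end is still one contiguous slice.
    prefix = [0]
    for mass in peptide + peptide:
        prefix.append(prefix[-1] + mass)

    spectrum = []
    for length in range(1, n):       # every length shorter than the whole peptide
        for start in range(n):       # every starting position
            spectrum.append(prefix[start + length] - prefix[start])
    if n:
        spectrum.append(prefix[n])   # add the entire peptide only once
    return sorted(spectrum)


print(cyclic_spectrum([113, 129, 128, 114]))
# [113, 114, 128, 129, 227, 242, 242, 257, 355, 356, 370, 371, 484]
```

This still produces on the order of len(peptide) squared masses, which can't be avoided since that is the size of the spectrum itself, but each mass now costs constant time.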