For example, ACTTTA occurs twice in ACTTACTTGATAAAGT, once at offset 0 with 2 mismatches, and once at offset 4 with 1 mismatch. So naive_2mm('ACTTTA', 'ACTTACTTGATAAAGT') should return the list |[0, 4].
I'm still a newbie. I've been working on this problem for almost a week now and I can't figure this out on my own. This is the code that I develop. Can someone explain why this code does not work and how can I do this?
def naive_2mm(p,t):
occurences = []
counter = 0
for i in range(len(t)-len(p)+1):
while counter != (len(p)-2):
for j in range(len(p)):
if t[i+j] == p[j]:
counter += 1
continue
else:
occurences.append(i)
return occurences
Your code is pretty close. I think the counter initialization is just about the only issue. This works:
def naive_2mm(p,t):
occurrences = []
for i in range(len(t)-len(p)+1):
counter = 0
for j,k in zip(p, t[i:i+len(p)]):
counter += j==k
if counter >= len(p)-2:
occurrences.append(i)
return occurrences
needle = 'ACTTTA'
haystack = 'ACTTACTTGATAAAGT'
print(naive_2mm(needle,haystack))
Or, for those of you who like one-liners, you can replace the 3-line sequence creating counter
with:
counter = sum(j==k for j,k in zip(p, t[i:i+len(p)]))