python function sum

Why is the "1" after sum( necessary to avoid a syntax error?


Why does this work:

def hamming_distance(dna_1,dna_2):
    hamming_distance = sum(1 for a, b in zip(dna_1, dna_2) if a != b)
    return hamming_distance

As opposed to this:

def hamming_distance(dna_1,dna_2):
    hamming_distance = sum(for a, b in zip(dna_1, dna_2) if a != b)
    return hamming_distance

I get this error:

 Input In [90]
    hamming_distance = sum(for a, b in zip(dna_1, dna_2) if a != b)
                           ^
SyntaxError: invalid syntax

I expected the function to work without the 1 after sum(.


Solution

  • The working expression can be unrolled into something like this:

    hamming_distance = 0
    for a, b in zip(dna_1, dna_2):
        if a != b:
            hamming_distance += 1
    

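    Without a number after +=, what should Python add? It doesn't know, and neither do we. The generator expression is in the same position: the expression before "for" (here, 1) is the value produced for each pair that passes the "if" filter. Leave it out and there is nothing for sum() to add, which Python reports as a syntax error.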

    If this "unrolled" form, or how your code relates to it, is new to you, start by reading up on list comprehensions, which generalize into generator expressions (which is what you have here).
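
To make the role of the leading 1 concrete, here is a minimal sketch that shows what the generator expression yields before sum() consumes it. The sample strings, the name yielded, and the hamming_distance_bools variant are illustrative assumptions, not part of the original question.

```python
def hamming_distance(dna_1, dna_2):
    # The expression before `for` (here, 1) is the value produced for each
    # pair that passes the `if` filter; sum() then adds those values up.
    return sum(1 for a, b in zip(dna_1, dna_2) if a != b)


def hamming_distance_bools(dna_1, dna_2):
    # Hypothetical alternative: yield the comparison itself.
    # True counts as 1 and False as 0 because bool is a subclass of int.
    return sum(a != b for a, b in zip(dna_1, dna_2))


# Materialize the generator to see the values sum() would receive:
# one 1 per mismatched position.
yielded = list(1 for a, b in zip("GAGCC", "CATCG") if a != b)
print(yielded)
print(hamming_distance("GAGCC", "CATCG"))
```

Either function gives the same count. The first matches the code in the question, while the second drops the filter and lets the comparison result serve as the value being summed.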