Counting differences between two strings

You could do this quite simply with a generator expression:

count = sum(1 for a, b in zip(seq1, seq2) if a != b)

If the sequences have different lengths, you may want to count the difference in length as a difference in content (I would). In that case, add an extra term to account for it:

count = sum(1 for a, b in zip(seq1, seq2) if a != b) + abs(len(seq1) - len(seq2))

Another, slightly odd way to write that, which takes advantage of True being 1 and False being 0, is:

sum(a != b for a, b in zip(seq1, seq2)) + abs(len(seq1) - len(seq2))
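Wrapped in a function, the length-aware count might look like the sketch below. The name count_differences is my own choice, not something from the question, and it assumes both arguments support len().

```python
def count_differences(seq1, seq2):
    """Count differing positions, plus the difference in length.

    Hypothetical helper name; assumes both arguments support len().
    """
    # zip stops at the shorter sequence, so this only compares the overlap.
    mismatches = sum(1 for a, b in zip(seq1, seq2) if a != b)
    # Every position beyond the overlap is counted as one more difference.
    return mismatches + abs(len(seq1) - len(seq2))


print(count_differences('hi', 'world'))  # 2 mismatches + 3 extra characters
```

Returning the two parts separately would also be reasonable if you need to tell mismatches apart from extra length.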

zip is a Python built-in that lets you iterate over two sequences at once. It stops at the end of the shorter sequence; observe:

>>> seq1 = 'hi'
>>> seq2 = 'world'
>>> for a, b in zip(seq1, seq2):
...     print('a =', a, '| b =', b)
... 
a = h | b = w
a = i | b = o

The generator expression evaluates much like sum([1, 1, 1]), where each 1 represents one difference between the two sequences. The if a != b filter makes the generator produce a value only when a and b differ.
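To make that concrete, here is a small sketch that turns the generator into a list so you can see the individual values. The two example strings are my own, picked because they differ in exactly three positions.

```python
# Example inputs chosen for illustration; they differ at indexes 2, 3 and 4.
seq1 = 'karolin'
seq2 = 'kathrin'

# The same expression as above, but as a list so the 1s are visible.
ones = [1 for a, b in zip(seq1, seq2) if a != b]
print(ones)       # [1, 1, 1]
print(sum(ones))  # 3

# Without the filter, each comparison yields a bool, and True counts as 1.
flags = [a != b for a, b in zip(seq1, seq2)]
print(flags)      # [False, False, True, True, True, False, False]
print(sum(flags)) # 3
```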


When you write for i in seq1, you are iterating over the characters, not the indexes. To get both, use enumerate: for i, ch in enumerate(seq1).

Or, even better, use the built-in function zip to go through both sequences at once.
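For comparison, here is a rough sketch of both loops side by side. The function names are made up for illustration, and the enumerate version needs an explicit guard because indexing seq2 past its end would raise IndexError.

```python
def count_with_enumerate(seq1, seq2):
    """Index-based version (hypothetical name)."""
    count = 0
    for i, ch in enumerate(seq1):
        # i is the index, ch is the character at that index.
        if i >= len(seq2):
            break  # seq2 ran out; stop before seq2[i] raises IndexError
        if ch != seq2[i]:
            count += 1
    return count


def count_with_zip(seq1, seq2):
    """zip-based version (hypothetical name); no index bookkeeping needed."""
    count = 0
    for a, b in zip(seq1, seq2):
        if a != b:
            count += 1
    return count


print(count_with_enumerate('world', 'hi'))  # 2
print(count_with_zip('world', 'hi'))        # 2
```

Both compare only the overlapping part of the two sequences; the zip version gets that behaviour for free.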

You also have a problem because you return before you print, so the print is never reached. Your return probably needs to be moved down and unindented.
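Your original function is not reproduced here, so the following is only a guess at its general shape, with a made-up name. It shows the print placed before the return, and the return at function level rather than inside the loop.

```python
def report_differences(seq1, seq2):
    """Hypothetical reconstruction; the real function may look different."""
    count = 0
    for a, b in zip(seq1, seq2):
        if a != b:
            count += 1
    # Print first: nothing after a return statement in the same block runs.
    print('differences:', count)
    # The return is outside the loop, so it runs once, after all pairs are checked.
    return count


result = report_differences('karolin', 'kathrin')
```

If the return were indented into the loop body, the function would exit on the first pass and report at most one difference.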

Tags:

Python