I've written a program to benchmark two ways of finding "the longest Collatz chain for integers less than some bound".
The first way uses "backtrack memoization": it keeps the current chain on a stack, from the starting value until it reaches a value already in the hash table, then pops all the stacked values into the hash table with incrementing chain-length values.
The second way uses simpler memoization that stores a chain length only for the starting value of each chain.
To my surprise and confusion, the algorithm that memoizes the entire sub-chain up to the first hash-table hit is consistently slower than the algorithm that memoizes only the starting value.
I'm wondering if this is due to one of the following factors:
1. Is Python really slow with stacks (appending to and popping from a list), enough to offset the performance gains from memoization?
2. Is my code/algorithm bad?
3. Is it simply the case that, statistically, as the integers grow large, the time spent revisiting the non-memoized elements of previously computed Collatz chains becomes asymptotically negligible, so that the overhead of pushing and popping stack elements isn't worth the gains?
In short, I'm wondering if this unexpected result is due to the language, the code, or math (i.e. the statistics of Collatz).
import time

def results(backtrackMemoization, start, maxChainValue, collatzDict):
    print()
    print(("with " if backtrackMemoization else "without ") + "backtracking memoization")
    print("length of " + str(collatzDict[maxChainValue[0]]) + " found for n = " + str(maxChainValue[0]))
    print("computed in " + str(round(time.time() - start, 3)) + " seconds")

def collatz(backtrackMemoization, start, maxChainValue, collatzDict):
    # note: relies on the global maxNum defined at the bottom of the script
    for target in range(1, maxNum):
        n = target
        if (backtrackMemoization):
            stack = []
        else:
            length = 0
        # follow the chain until a value that is already memoized is reached
        while (n not in collatzDict):
            if (backtrackMemoization):
                stack.append(n)
            else:
                length = length + 1
            if (n % 2):
                n = 3 * n + 1
            else:
                n = n // 2
        if (backtrackMemoization):
            # pop the whole sub-chain into the dictionary
            additionalLength = 1
            while (len(stack) > 0):
                collatzDict[stack.pop()] = collatzDict[n] + additionalLength
                additionalLength = additionalLength + 1
        else:
            # memoize only the starting value
            collatzDict[target] = collatzDict[n] + length
        if (collatzDict[target] > collatzDict[maxChainValue[0]]):
            maxChainValue[0] = target

def benchmarkAlgo(maxNum, backtrackMemoization):
    start = time.time()
    maxChainValue = [1]
    collatzDict = {1: 0}
    collatz(backtrackMemoization, start, maxChainValue, collatzDict)
    results(backtrackMemoization, start, maxChainValue, collatzDict)

try:
    maxNum = int(input("enter upper bound> "))
    print("setting upper bound to " + str(maxNum))
except ValueError:
    maxNum = 100000
    print("defaulting upper bound to " + str(maxNum))

benchmarkAlgo(maxNum, True)
benchmarkAlgo(maxNum, False)
There is a tradeoff in your code. Without the backtrack memoization, dictionary lookups miss about twice as many times as when you use it; I measured this for maxNum = 1,000,000.
On the other hand, with backtrack memoization you are constructing a much bigger dictionary, since you are collecting chain lengths not only for the target values but also for every value encountered in the middle of a chain, including values above maxNum. Without backtracking, the final collatzDict holds exactly maxNum - 1 entries (one per target); with backtracking, for maxNum = 1,000,000, it ends up much larger.
There is a cost to writing to this dictionary that many more times, popping all these additional values off the stack, and so on. It seems that, in the end, this cost outweighs the benefit of fewer dictionary lookup misses. In my tests, the code with backtrack memoization ran about 20% slower.
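One way to see this tradeoff for yourself is to count the lookup misses and the final dictionary size directly. Below is a minimal instrumented sketch; the function collatz_instrumented and the misses counter are my own names, not part of the original code, and it uses the same chain-length convention as the code above:

def collatz_instrumented(maxNum, backtrackMemoization):
    # counts dictionary-lookup misses and reports the final dictionary size
    collatzDict = {1: 0}
    misses = 0
    for target in range(1, maxNum):
        n = target
        stack = []
        length = 0
        while n not in collatzDict:
            misses += 1  # each loop iteration corresponds to one failed lookup
            if backtrackMemoization:
                stack.append(n)
            else:
                length += 1
            n = 3 * n + 1 if n % 2 else n // 2
        if backtrackMemoization:
            additionalLength = 1
            while stack:
                collatzDict[stack.pop()] = collatzDict[n] + additionalLength
                additionalLength += 1
        else:
            collatzDict[target] = collatzDict[n] + length
    return misses, len(collatzDict)

print(collatz_instrumented(1000000, True))   # fewer misses, much larger dictionary
print(collatz_instrumented(1000000, False))  # more misses, exactly maxNum - 1 entries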
It is possible to optimize backtrack memoization to keep dictionary lookup misses low while reducing the cost of constructing the dictionary: instead of pushing n alone, push tuples (n, i), where n is as in your code and i is the length of the chain traversed up to that point (i.e. i is incremented at every iteration of the while loop). Such a tuple is put on the stack only if n < maxNum. In addition, keep track of how long the whole chain gets before you find a value that is already in the dictionary (i.e. of the total number of iterations of the while loop). The dictionary obtained in this way will be exactly the same as the one constructed without backtrack memoization, but it will be built in a more efficient way, since a key n is added the first time it is encountered. For this reason, dictionary lookup misses will still be much lower than without backtrack memoization; this is what I observed for maxNum = 1,000,000.
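Here is a minimal sketch of this optimized version (the function name collatz_optimized and all variable names are mine, not from the original code):

def collatz_optimized(maxNum):
    # push (n, i) pairs, but only for values below maxNum, so the final
    # dictionary has the same keys as in the no-backtracking version
    collatzDict = {1: 0}
    for target in range(1, maxNum):
        n = target
        i = 0             # while-loop iterations traversed so far in this chain
        stack = []
        while n not in collatzDict:
            if n < maxNum:
                stack.append((n, i))   # remember the value and its depth in the chain
            i += 1
            n = 3 * n + 1 if n % 2 else n // 2
        # i is now the whole chain length down to the memoized value n;
        # a value pushed at depth j sits i - j steps above n
        while stack:
            m, j = stack.pop()
            collatzDict[m] = collatzDict[n] + (i - j)
    return collatzDict

To benchmark it, time collatz_optimized(maxNum) the same way benchmarkAlgo does; the record can then be read off the finished dictionary, e.g. with max(range(1, maxNum), key=collatzDict.get).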
For larger values of maxNum, the optimized code should run faster than the version without backtrack memoization. In my tests it was about 25% faster for maxNum >= 1,000,000.