Search code examples
pythondictionaryrecursiondynamic-programmingmemoization

Efficiency of dictionaries in python:


In the recursive function below, I've applied some memoization techniques in python to save previous calculate values in a dictionary, which in theory should have storing and retrieval of O(1). However, the execution time in about three times more when using memoization than simple recursion.

In the two pieces of the code below, what would be the possible cause for such higher execution time in the second code? Am I using dictionaries the wrong way?

Simple recursive function:

import time

def deletion_distance(str1, str2):
  if str1 is "":
    return len(str2)
  elif str2 is "":
    return len(str1)
  else:
    if str1[len(str1) - 1] == str2[len(str2) - 1]:
      return deletion_distance(str1[:-1],str2[:-1])
    else:
      return 1 + min(deletion_distance(str1, str2[:-1]),deletion_distance(str1[:-1],str2))

str1 = "dragonified"
str2 = "infinitezimal"

start = time.time()
for i in range(0,2):
    deletion_distance(str1,str2)
end = time.time()

print((end - start) / 2)

Dynamic programming using dictionaries for memoization:

import time

def deletion_distance(str1, str2):
  global aux_dict

  if str1 is "":
    return len(str2)
  elif str2 is "":
    return len(str1)
  else:
    if str1[len(str1) - 1] == str2[len(str2) - 1]:
      if "1"+str1[:-1]+"2"+str2[:-1] in aux_dict:
        return aux_dict["1"+str1[:-1]+"2"+str2[:-1]]
      else:
        aux_dict["1"+str1[:-1]+"2"+str2[:-1]] = deletion_distance(str1[:-1],str2[:-1])
        return aux_dict["1"+str1[:-1]+"2"+str2[:-1]]
    else:

      return 1 + min(
        aux_dict["1"+str1+"2"+str2[:-1]] if "1"+str1+"2"+str2[:-1] in aux_dict else deletion_distance(str1, str2[:-1]),
        aux_dict["1"+str1[:-1]+"2"+str2] if "1"+str1[:-1]+"2"+str2 in aux_dict else deletion_distance(str1[:-1],str2))

aux_dict = {}      
str1 = "dragonified"
str2 = "infinitezimal"

start = time.time()
for i in range(0,2):
    deletion_distance(str1,str2)
end = time.time()

print((end - start) / 2)

Both functions calculate the deletion distance of two strings (PRAMP.COM question), which is simply the minimum number of character of both strings that much be removed from the two strings so that they become the same.


Solution

  • You don't use the dictionary at all, because you use a new empty dictionary for each function call.

    aux_dict = {}
    
    def deletion_distance(str1, str2):
        if (str1, str2) in aux_dict:
            return aux_dict[str1, str2]
        if not str1:
            return len(str2)
        elif not str2:
            return len(str1)
        elif str1[-1] == str2[-1]:
            result = deletion_distance(str1[:-1],str2[:-1])
        else:
            result = 1 + min(
                deletion_distance(str1, str2[:-1]),
                deletion_distance(str1[:-1], str2),
            )
        aux_dict[str1, str2] = result
        return result
    

    which reduces the amount of calls for the two example strings from 1685178 to 268, and therefore the cached version is 3000 times faster.

    EDIT: in your updated question, you still don't use the dictionary correctly. Only in the case, that the last character of both strings are equal, the dictionary is used.