Search code examples
pythonpython-3.xrecursiondata-structuresrecursive-datastructures

Inconsistent output in Recursion?


I was solving a question today on Leetcode named Lexicographically Smallest Equivalent String (link) and I came up with a solution of DFS with some customization to solve the problem and here's is my code.

#!/usr/bin/python3

from collections import defaultdict, OrderedDict

class Solution:
    def DFSMap(self, node, map, adj, visited):
        if node not in visited:
            visited.add(node)
            for i in adj[node]:
                map[node]=min(map[node] or "z", node, self.DFSMap(i, map, adj, visited))
            return map[node]
        else:
            return map[node] or node

    def smallestEquivalentString(self, s1: str, s2: str, baseStr: str) -> str:
        referenceMap = defaultdict(set)
        d=OrderedDict()
        n=len(s1)
        for i in range(n):
            referenceMap[s2[i]].add(s1[i])
            referenceMap[s1[i]].add(s2[i])
        for j in sorted(referenceMap.items(),key=lambda x:x[0]):
            d[j[0]]=j[1]

        m=defaultdict(lambda:None)
        visited=set()
        for i in d:
            if i not in visited:
                self.DFSMap(i,m,d,visited)

        res=""
        print(m)
        for i in baseStr:
            res+=(m[i] or i)
        return res


if __name__ == "__main__":
    s1=input()
    s2=input()
    baseStr=input()
    s=Solution().smallestEquivalentString(s1, s2, baseStr)
    print(s)

The issue I'm facing is, for every run of the same code, I get a different output of hashmap without usage of random functionalities in my code. Here are few snippets of the outputs.

On other run I get,

Is it the issue with recursion or any logical error?

Could someone help me figure out the reason for this with an explanation?

The output for the input s1 = "leetcode", s2 = "programs", baseStr = "sourcecode" was supposed to be "aauaaaaada" but that isn't happening certainly and consistently.


Solution

  • This is caused by the non-determined order in which a set is iterated. And this impacts in which order for i in adj[node] is iterating.

    For the example input you provided, the following edges are defined that relate to the letter "a" (which is the first letter for which the DFS search is performed):

                r ─── c 
                │
    a ─── o ─── e 
                │
                s
    

    The adjacency set for "a" has only one entry ("o") so no surprises there, but "o" has two neighbors, and "e" three, ...etc. The order affects the result.

    Let's assume this order (adding assignments to map keys as they occur):

     DFSMap("a")
       DFSMap("o")
         DFSMap("a")
       map["o"] = "a"
         DFSMap("e")
           DFSMap("r")
             DFSMap("e")
           map["r"] = "e"
             DFSMap("c")
               DFSMap("r")
             map["c"] = "c"
           map["r"] = "c"
         map["e"] = "c"
           DFSMap("s")
             DFSMap("e")
           map["s"] = "c"
         map["e"] = "c"
           DFSMap("o")
         map["e"] = "a"
       map["o"] = "a"
     map["a"] = "a"
    

    Note how not all "nodes" in this connected component get associated with the value "a". In this particular traversal, the letters "c", "r" and "s" remain associated with "c", instead of "a". In another order of traversal the anomalies could be elsewhere.

    The random results illustrate that the algorithm is not correct. Typically when the lexical minimum letter is visited first before lots of other connected nodes, the results will incorrect, as this least letter is not propagated to nodes that are further away.

    Here is a correction of your dfs-code. It takes an extra least argument to allow for that propagation:

        def DFSMap(self, node, mp, adj, visited, least="z"):
            least = min(least, mp[node] or node)
            if node not in visited:
                visited.add(node)
                for i in adj[node]:
                    least = self.DFSMap(i, mp, adj, visited, least)
            mp[node] = least
            return least
    

    NB: renamed map to mp, as the first is already used by Python for a native function.

    NB: This algorithm relies on the fact that the main loop in smallestEquivalentString visits the letters in lexical order (which is the case), so that the first visited letter in a connected component is also the least in that component.