Search code examples
python-3.xalgorithmgraphtopological-sort

Alien Dictionary Python


Alien Dictionary

Link to the online judge -> LINK

Given a sorted dictionary of an alien language having N words and k starting alphabets of standard dictionary. Find the order of characters in the alien language. Note: Many orders may be possible for a particular test case, thus you may return any valid order and output will be 1 if the order of string returned by the function is correct else 0 denoting incorrect string returned.

Example 1:

Input: 
N = 5, K = 4
dict = {"baa","abcd","abca","cab","cad"}
Output:
1
Explanation:
Here order of characters is 
'b', 'd', 'a', 'c' Note that words are sorted 
and in the given language "baa" comes before 
"abcd", therefore 'b' is before 'a' in output.
Similarly we can find other orders.

My working code:

 from collections import defaultdict
class Solution:

    def __init__(self):
        self.vertList = defaultdict(list)

    def addEdge(self,u,v):
        self.vertList[u].append(v)
    
    def topologicalSortDFS(self,givenV,visited,stack):
        visited.add(givenV)

        for nbr in self.vertList[givenV]:
            if nbr not in visited:
                self.topologicalSortDFS(nbr,visited,stack)
        stack.append(givenV)

    def findOrder(self,dict, N, K):
        list1 = dict
        for i in range(len(list1)-1):
            word1 = list1[i]
            word2 = list1[i+1]
            rangej = min(len(word1),len(word2))
            for j in range(rangej):
                if word1[j] != word2[j]:
                    u = word1[j]
                    v = word2[j]
                    self.addEdge(u,v)
                    break
        stack = []
        visited = set()
        vlist = [v for v in self.vertList]
        for v in vlist:
            if v not in visited:
                self.topologicalSortDFS(v,visited,stack)
        result = " ".join(stack[::-1])

        return result
                
                
#{ 
#  Driver Code Starts
#Initial Template for Python 3

class sort_by_order:
    def __init__(self,s):
        self.priority = {}
        for i in range(len(s)):
            self.priority[s[i]] = i
    
    def transform(self,word):
        new_word = ''
        for c in word:
            new_word += chr( ord('a') + self.priority[c] )
        return new_word
    
    def sort_this_list(self,lst):
        lst.sort(key = self.transform)

if __name__ == '__main__':
    t=int(input())
    for _ in range(t):
        line=input().strip().split()
        n=int(line[0])
        k=int(line[1])
        
        alien_dict = [x for x in input().strip().split()]
        duplicate_dict = alien_dict.copy()
        ob=Solution()
        order = ob.findOrder(alien_dict,n,k)
        
        x = sort_by_order(order)
        x.sort_this_list(duplicate_dict)
        
        if duplicate_dict == alien_dict:
            print(1)
        else:
            print(0)

My problem:

The code runs fine for the test cases that are given in the example but fails for ["baa", "abcd", "abca", "cab", "cad"]

It throws the following error for this input:

Runtime Error:
Runtime ErrorTraceback (most recent call last):
  File "/home/e2beefe97937f518a410813879a35789.py", line 73, in <module>
    x.sort_this_list(duplicate_dict)
  File "/home/e2beefe97937f518a410813879a35789.py", line 58, in sort_this_list
    lst.sort(key = self.transform)
  File "/home/e2beefe97937f518a410813879a35789.py", line 54, in transform
    new_word += chr( ord('a') + self.priority[c] )
KeyError: 'f'

Running in some other IDE:

If I explicitly give this input using some other IDE then the output I'm getting is b d a c


Solution

  • Interesting problem. Your idea is correct, it is a partially ordered set you can build a directed acyclcic graph and find an ordered list of vertices using topological sort.

    The reason for your program to fail is because not all the letters that possibly some letters will not be added to your vertList.

    Spoiler: adding the following line somewhere in your code solves the issue

    vlist = [chr(ord('a') + v) for v in range(K)]

    A simple failing example

    Consider the input

    2 4
    baa abd
    

    This will determine the following vertList

    {"b": ["a"]}
    

    The only constraint is that b must come before a in this alphabet. Your code returns the alphabet b a, since the letter d is not present you the driver code will produce an error when trying to check your solution. In my opinion it should simply output 0 in this situation.