python ocr permutation

Permutations and combinations in Python


I'm working on an OCR use case and have identified common misclassifications from the confusion matrix. For example, '1' is confused with 'J', and '2' is confused with 'Z' and 'J'.

For a given word, I am trying to write a Python script that creates all the permutations of that word that account for these misclassifications.

Example:

  • Common Misclassifications: {'1':['J'],'2':['Z','J']}
  • Input: "AB1CD2"
  • Output: AB1CD2, AB1CDZ, ABJCD2, ABJCDZ, AB1CDJ, ABJCDJ

How do I go about solving this?


Solution

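  • You get a neat solution by using a dictionary of all possible classifications, not just the misclassifications. That is, you first "enrich" your misclassification dictionary so that every character maps to itself plus its common misclassifications. The Cartesian product of these per-character lists then gives every possible output.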

    from itertools import product
    
    all_characters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    common_misclass = {'1':['J'],'2':['Z','J']}
    input_string = "AB1CD2"
    
    # Map every character to itself plus any common misclassifications.
    common_class = {}
    for char in all_characters:
        if char in common_misclass:
            common_class[char] = [char] + common_misclass[char]
        else:
            common_class[char] = [char]
    
    # Take one candidate per position; product() yields every combination.
    # Note: input characters outside all_characters raise a KeyError here.
    possible_outputs = ["".join(tup) for tup in
        product(*[common_class[letter] for letter in input_string])]
    
    print(possible_outputs)
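
  • If you would rather not list the whole alphabet up front, the same idea can be sketched with `dict.get`, so that any character missing from the misclassification dictionary simply maps to itself. This is only a minimal variation on the approach above; the function name `expand_misclassifications` is a hypothetical helper, not part of the original answer.

```python
from itertools import product


def expand_misclassifications(word, misclass):
    """Return every variant of `word` that the given confusions allow.

    Hypothetical helper: `misclass` maps a character to the list of
    characters it is commonly mistaken for.
    """
    # For each position: the character itself, then its known confusions.
    choices = [[ch] + misclass.get(ch, []) for ch in word]
    # One pick per position; product() enumerates all combinations.
    return ["".join(combo) for combo in product(*choices)]


common_misclass = {'1': ['J'], '2': ['Z', 'J']}
variants = expand_misclassifications("AB1CD2", common_misclass)
print(variants)
print(len(variants))  # 2 choices for '1' times 3 choices for '2'
```

The number of results is the product of the number of candidates at each position, so it can grow quickly for long words with many confusable characters. The results come out in `product` order (the last position varies fastest), which may differ from the order shown in the question.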