python, list, bijection

Find possible bijection between characters and digits


Let's say you have a string S and a sequence of digits in a list L such that len(S) = len(L).

What would be the cleanest way of checking whether there is a bijection between the characters of the string and the digits in the sequence, such that each character matches one and only one digit?

For example, "aabbcc" should match with 115522 but not 123456 or 111111.

I have a complex setup with two dicts and a loop, but I'm wondering if there's a cleaner way of doing this, maybe by using some function from the Python libraries.


Solution

  • I would use a set for this:

    In [9]: set("aabbcc")
    Out[9]: set(['a', 'c', 'b'])
    
    In [10]: set(zip("aabbcc", [1, 1, 5, 5, 2, 2]))
    Out[10]: set([('a', 1), ('c', 2), ('b', 5)])
    

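    The second set (the distinct pairs) has the same length as the first set (the distinct letters) if and only if each letter maps to a single number. That alone is not enough: "aabbcc" against 111111 also gives three pairs. The set of pairs must also have the same length as the set of distinct numbers, which holds if and only if each number comes from a single letter. When all three lengths agree, the mapping is a bijection. (If they do not, some letter is paired with two different numbers, or some number with two different letters.)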

    Here is code that implements the idea:

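    def is_bijection(seq1, seq2):
        """Return True if pairing seq1 with seq2 element-wise is one-to-one."""
        distinct1 = set(seq1)                   # distinct items on the left
        distinct2 = set(seq2)                   # distinct items on the right
        distinct_mappings = set(zip(seq1, seq2))  # distinct (left, right) pairs
        # All three counts agree only when every left item has exactly one
        # partner and every right item has exactly one partner.
        return len(distinct1) == len(distinct2) == len(distinct_mappings)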
    

    This can also return True when one sequence is shorter than the other, because zip stops at the end of the shorter sequence, so the leftover elements never appear in the pairs. If the sequences must be the same length, you should add a check for that.
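
As a minimal sketch of the length check mentioned above, the variant below rejects sequences of different lengths before comparing the sets. The name `is_strict_bijection` is hypothetical and not part of the original answer; the sequences are copied into lists first on the assumption that callers may pass arbitrary iterables.

```python
def is_strict_bijection(seq1, seq2):
    """Return True if seq1 and seq2 have equal length and pair up one-to-one.

    Hypothetical helper: the set comparison from the answer, plus a length check.
    """
    # Materialize both inputs so len() works even for generators.
    seq1, seq2 = list(seq1), list(seq2)
    if len(seq1) != len(seq2):
        return False
    pairs = set(zip(seq1, seq2))
    # Equal counts mean each left item has one partner and vice versa.
    return len(set(seq1)) == len(set(seq2)) == len(pairs)


# The examples from the question, plus one length mismatch.
print(is_strict_bijection("aabbcc", [1, 1, 5, 5, 2, 2]))   # True
print(is_strict_bijection("aabbcc", [1, 2, 3, 4, 5, 6]))   # False
print(is_strict_bijection("aabbcc", [1, 1, 1, 1, 1, 1]))   # False
print(is_strict_bijection("aabbcca", [1, 1, 5, 5, 2, 2]))  # False
```

The last call is the case the unchecked version would accept: the extra "a" is dropped by zip, and all three sets still have three elements.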