Search code examples
pythonstringexpansionkarnaugh-map

Creating a list of possibilities from splitting strings at a certain character - Karnaugh Map


I am trying to create "truth tables" for use in a Karnaugh map solver. The Karnaugh Map solver takes in the settings of the variables ABCD as a string of 4 numbers representing what they are set to. For example, ABCD could be anything from "0000" to "1111." The solver takes in the minterms (settings of ABCD which equal 1 in the truth table), but it outputs "*000" or other examples with * to represent that the Kmap has removed the first variable in the resulting equation. I am using this for an iterative process, so I need to be able to convert lists of these asterix-filled strings to a list of fully expanded strings :

[*000, 0*00] -> [0000, 1000, 0100]

The way I am currently doing it (with while loops) works but is very slow for the application I am using it for. It takes at least 10 minutes to get through, because I need to do this process around 1000 times (one for each piece of the dataset I am using) It goes quickly until around 430, when it slows significantly because of how long it takes to expand these complicated sequences. I can also confirm that it is not the Karnaugh map implementation I am using, because that solves the same long sequences instantly.

I'm not sure of a more pythonic way to do this which also accounts for things like "*11*", which have multiple spots where the expression needs to be expanded:

[*11*] -> [011*, 111*] -> [0110, 0111, 1110, 1111]

Any ideas which are efficient and pythonic?

My Code:

        ast = True
        while ast:
            new_string_result = copy.deepcopy(string_result)
            for i in range(len(string_result)):
                for c, char in enumerate(string_result[i]):
                    if char == "*":
                        # replace with 0 and 1
                        new_string_result.append(string_result[i][:c] + "0" + string_result[i][c+1:])
                        new_string_result.append(string_result[i][:c] + "1" + string_result[i][c+1:])

                if "*" in string_result[i]:    
                    # remove original (only once, even if multiple *)
                    # print("Removing ", string_result[i])
                    new_string_result.remove(string_result[i])
                    
            # print("Kmap result during fix iter: ", new_string_result)
            
            ast_found = False
            for i in range(len(new_string_result)):
                if "*" in new_string_result[i]:
                    ast_found = True
            
            # print(ast_found)
            ast = ast_found
            string_result = new_string_result

Solution

  • I found a couple of ideas that are 2x-3x faster.

    Reference code

    This is the code from your original question, with a break statement added to remove duplicates from the output. (If you don't do this, then expanding **** gives 384 results.)

    def kmap_orig(string_result):
        ast = True
        while ast:
            new_string_result = copy.deepcopy(string_result)
            for i in range(len(string_result)):
                for c, char in enumerate(string_result[i]):
                    if char == "*":
                        # replace with 0 and 1
                        new_string_result.append(string_result[i][:c] + "0" + string_result[i][c+1:])
                        new_string_result.append(string_result[i][:c] + "1" + string_result[i][c+1:])
                        break
    
                if "*" in string_result[i]:    
                    # remove original (only once, even if multiple *)
                    # print("Removing ", string_result[i])
                    new_string_result.remove(string_result[i])
    
            # print("Kmap result during fix iter: ", new_string_result)
    
            ast_found = False
            for i in range(len(new_string_result)):
                if "*" in new_string_result[i]:
                    ast_found = True
    
            # print(ast_found)
            ast = ast_found
            string_result = new_string_result
        return string_result
    

    Some comments on this code:

    • list.remove() is slow. In order to remove an element from a list, it must search through each element of the list, and check if they are equal. Once it finds the match, it has to move every element of the list after that point down one space. Avoid this if possible.
    • You're iterating over the list with for i in range(len(string_result)):, but the variable i is not used for anything besides string_result[i]. This is usually slower than for elem in string_result:, and more complex.

    Test data

    Here's the data that I used to test each of these. It is ten thousand random strings of *, 0, or 1, each ten characters long.

    data = [''.join(random.choices(['0', '1', '*'], k=10)) for i in range(10000)]
    correct_output = [set(kmap_orig([string])) for string in data]
    

    (Note: when checking the correctness of all of the solutions below, I ignored duplicates and order of solutions.)

    Iterative solution

    This one is about 3x faster, mostly from the removal of the inner for loop.

    def kmap_iterative(string_result):
        changed = True
        while changed:
            changed = False
            new_string_result = []
            for string in string_result:
                if "*" in string:
                    c = string.find("*")
                    assert c != -1
                    # replace with 0 and 1
                    prefix = string[:c]
                    suffix = string[c+1:]
                    new_string_result.append(prefix + "0" + suffix)
                    new_string_result.append(prefix + "1" + suffix)
                    changed = True
                else:
                    # no * is present, so add unchanged
                    new_string_result.append(string)
            string_result = new_string_result
        return string_result
    

    Other optimizations done:

    • Rather than find string[:c] twice, do it once and save to a variable.
    • The code spent a lot of time finding the value of ast_found. It is faster to keep track of whether any modifications were done to the list, and exit if no modifications were made. The downside is that it loops one more time than necessary.
    • Since the inner loop is essentially just looking for the first instance of * in the string, replace it with a function from the standard library which does the same thing.
    • Avoid copy.deepcopy() and list.remove() by starting with an empty list, and adding elements from string_result which don't have any asterisks.

    Recursive solution

    This is about 20% slower than the iterative solution, but it is the shortest and (in my opinion) simplest solution.

    def expand_string_recursive(string):
        if "*" not in string:
            yield string
        else:
            c = string.find("*")
            assert c != -1
            # replace with 0 and 1
            prefix = string[:c]
            suffix = string[c+1:]
            yield from expand_string_recursive(prefix + "0" + suffix)
            yield from expand_string_recursive(prefix + "1" + suffix)
    
    def kmap_recursive(string_result):
        ret = []
        for string in string_result:
            ret.extend(expand_string_recursive(string))
        return ret
    

    Itertools solution

    The idea behind this solution is that what you're asking for is a Cartesian product of [0, 1], done n times, where n is the number of asterisks. There's something in the standard library which can find this. This is also 20% slower than the iterative solution.

    def expand_string_itertools(string):
        blank_positions = [i for i, letter in enumerate(string) if letter == "*"]
        blanks = len(blank_positions)
        # Convert to list to allow mutability
        string = list(string)
        for product in itertools.product(["0", "1"], repeat=blanks):
            for position, replacement in zip(blank_positions, product):
                string[position] = replacement
            # Convert back to string
            yield ''.join(string)
    
    def kmap_itertools(string_result):
        ret = []
        for string in string_result:
            ret.extend(expand_string_itertools(string))
        return ret
    

    Benchmark results

    kmap_orig
    706 ms ± 5.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    kmap_iterative
    201 ms ± 1.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    kmap_recursive
    251 ms ± 2.33 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    kmap_itertools
    243 ms ± 4.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)