Efficient method for finding Sudoku Hidden Singles in Python

I am writing a sudoku solver with human-like tactics. I wrote a couple of methods for finding hidden singles on a box, that is, to check if there is only one cell available for a given candidate to be placed within the box ("hidden" means even if the cell itself has more candidates).

I use the following structure for the lists: the board[9][9] stores the placed numbers from 1 to 9, and 0 if there is no placed number. the possibles[9][9][9] stores the candidates for a given cell, and 0 if the candidate is already eliminated. Since I am also writing the GUI with Pygame, I prefer not to remove elements from the possibles, thus, if the cell has only the number 5 as candidate, the possible list would be possibles[i][j] = [0, 0, 0, 0, 5, 0, 0, 0, 0].

Here's the hidden single method:

def hidden_singles(board, possibles):
    # CHECK BOX
    print("method 1")
    hiddenSinglesPos_box = []
    t0 = time.time()
    iterations = 0
    # iterate over each cell in the board
    for i, row in enumerate(board):
        for j, bnum in enumerate(row):
            # if the cell already has a placed number, skip
            if bnum != 0:
                continue

            # get box index
            ii = 3 * (i // 3)
            jj = 3 * (j // 3)

            # reduce the candidate list of cell to only non-zeros
            p = [_ for _ in possibles[i][j] if _ != 0]

            # for each candidate, check the box for a hidden single !?!?
            for k, pnum in enumerate(p):
                count = 0
                for ibox in range(ii, ii + 3):
                    for jbox in range(jj, jj + 3):
                        # skip again if the cell within the box has a placed number
                        if board[ibox][jbox] != 0:
                            continue
                        iterations += 1
                        if possibles[ibox][jbox][pnum - 1] != 0:
                            count += 1
                if count == 1:
                    hiddenSinglesPos_box.append([i, j, pnum])
    deltaT = time.time() - t0
    print(deltaT)
    print(f"Iterations: {iterations}")
    print(f"{len(hiddenSinglesPos_box)} hidden singles")
    print(hiddenSinglesPos_box)

It is worth mentioning that, before calling this method, I have already eliminated the obvious non-candidates by checking row, col and box.

This works and it finds the hidden singles with ~1000 iterations, but surely it can be improved. I noticed that I could remove a candidate from the box search as soon as the second match is found. For instance, if the first cell in the box has the candidates [1, 2, 3] and the second cell has [1, 2, 4], there is no need to check candidates 1 and 2 for the second cell (I do not know how to do that without overcomplicating everything though). I do visit every board cell because that is the way I found to keep track not only of the existence of hidden single in a box but also its coordinate.

I am a beginner in Python and in coding too, so I am accepting suggestions about this method or about the structure in general and how I store the board, possibles, etc.

Solution

You have an interesting problem. To get "less" iterations, use abstraction like sets. This makes interpreter call compiled routines (mostly written in c) instead of parsing complex python logic.

board = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 2, 0, 0, 0, 0, 0, 0, 0],
    [0, 3, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 7, 0, 0, 0, 0, 0, 0],
    [0, 0, 8, 6, 5, 4, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0]
]

fullSet = list(range(1, len(board)+1))
zero = set([0])

def findPossibles(board):
    result = [None] * len(board)
    for y in range(len(board)):
        result[y] = [None] * len(board[0])
        for x in range(len(board[0])):
            result[y][x] = fullSet.copy()

    for y, row in enumerate(result):
        for x, e in enumerate(row):
            if board[y][x] != 0:
                e = [0] * 9
            for i in range(len(board)):
                num = board[i][x]
                if num != 0: e[num-1] = 0
            for i in range(len(board[0])):
                num = board[y][i]
                if num != 0: e[num-1] = 0
    return result

def findSingles(board, possibles):
    result = []

    for y in range(3):
        for x in range(3):
            taken = set()
            # collect quadrant values
            for iy in range(y*3, y*3+3):
                for ix in range(x*3, x*3+3):
                    taken.add(board[iy][ix])

            # find hidden singles
            for iy in range(y*3, y*3+3):
                for ix in range(x*3, x*3+3):
                    res = set(possibles[iy][ix]) - taken - zero
                    if len(res) == 1:
                        result.append((ix, iy, list(res)[0]))
    return result

import time

t0 = time.time()
possibles = findPossibles(board)
singles = findSingles(board, possibles)
print(time.time() - t0)
print(singles) # [(1, 7, 9)]

The code handling hidden singles should make around 162 iterations.

Edit

I came up with this algorithm, using dictionary to collect all possible positions for a number inside a quadrant. Then the ones that have only one possible position are collected. I also improved outer complexity of findPossibles and replaces lists with sets. We can discus code further in comments.

board = [
    [9, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 2, 0, 0, 0, 0, 0, 0, 0],
    [0, 3, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 7, 9, 0, 0, 0, 0, 0],
    [0, 0, 8, 6, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 9, 0, 0]
]

fullSet = set(range(1, len(board)+1))

def findPossibles(board):
    sz = len(board)
    br = range(sz)
    result = [None] * sz
    for y in br:
        result[y] = [None] * sz
        for x in br:
            result[y][x] = fullSet.copy()

    for y, row in enumerate(board):
        taken = set(row)
        for x in br:
            result[y][x] -= taken

    for x in br:
        taken = set()
        for y in br:
            taken.add(board[y][x])
        for y in br:
            result[y][x] -= taken
    
    for y in range(3):
        for x in range(3):
            taken = set()
            rx = x * 3
            ry = y * 3
            
            for iy in range(ry, ry+3):
                for ix in range(rx, rx+3):
                    taken.add(board[iy][ix])

            for iy in range(ry, ry+3):
                for ix in range(rx, rx+3):
                    result[iy][ix] -= taken

    return result

def findSingles(possibles):
    result = []
    for y in range(3):
        for x in range(3):
            table = {i:[] for i in fullSet}
            rx = x * 3
            ry = y * 3

            for iy in range(ry, ry+3):
                for ix in range(rx, rx+3):
                    for p in possibles[iy][ix]:
                        table[p].append((ix, iy))

            for p in table:
                if len(table[p]) == 1:
                    result.append(table[p][0] + (p,))

    return result

import time

t0 = time.time()
possibles = findPossibles(board)
singles = findSingles(possibles)
print(time.time() - t0)
print(singles) # [(1, 7, 9)]

Edit 2

To improve the method findPossibles, we should assign an empty set for the candidates in cells with already placed numbers.

def findPossibles(board):
    sz = len(board)
    br = range(sz)
    result = [None] * sz
    for y in br:
        result[y] = [None] * sz
        for x in be:
            # check if this board cell is empty
            if board[y][x] == 0:
                result[y][x] = fullSet.copy()
            else:
                result[y][x] = set()