How to create a Sudoku puzzle in Python

Goal is to create a 9x9 Sudoku matrix in Python.

So I got this far. But I cannot seem to get the program to get the interior contingent boxes correct.

def sudoku(size):
    import random as rn
    mydict = {}
    n = 0
    while len(mydict) < 9:
        n += 1
        x = range(1, size+1)
        testlist = rn.sample(x, len(x))

        isgood = True
        for dictid,savedlist in mydict.items():
            if isgood == False:
                break
            for v in savedlist:
                if testlist[savedlist.index(v)] == v:
                    isgood = False
                    break
        if isgood == True:
            #print 'success', testlist
            mydict[len(mydict)] = testlist
    return mydict, n

return_dict, total_tries = sudoku(9)
for n,v in return_dict.items():
    print n,v
print 'in',total_tries,'tries'

Solution

You can generate a random sudoku solution board where all numbers are filled in and then remove some of them to create the puzzle. This will ensure that the puzzle always has a solution. Making sure that it has exactly one solution is a bit more challenging (hint: you must leave at least 17 numbers for a 9x9 sudoku)

The algorithm below will generate a NxN random sudoku solution board instantly for N < 1000.

base  = 3
side  = base*base

# pattern for a baseline valid solution
def pattern(r,c): return (base*(r%base)+r//base+c)%side

# randomize rows, columns and numbers (of valid base pattern)
from random import sample
def shuffle(s): return sample(s,len(s)) 
rBase = range(base) 
rows  = [ g*base + r for g in shuffle(rBase) for r in shuffle(rBase) ] 
cols  = [ g*base + c for g in shuffle(rBase) for c in shuffle(rBase) ]
nums  = shuffle(range(1,base*base+1))

# produce board using randomized baseline pattern
board = [ [nums[pattern(r,c)] for c in cols] for r in rows ]

for line in board: print(line)

[6, 2, 5, 8, 4, 3, 7, 9, 1]
[7, 9, 1, 2, 6, 5, 4, 8, 3]
[4, 8, 3, 9, 7, 1, 6, 2, 5]
[8, 1, 4, 5, 9, 7, 2, 3, 6]
[2, 3, 6, 1, 8, 4, 9, 5, 7]
[9, 5, 7, 3, 2, 6, 8, 1, 4]
[5, 6, 9, 4, 3, 2, 1, 7, 8]
[3, 4, 2, 7, 1, 8, 5, 6, 9]
[1, 7, 8, 6, 5, 9, 3, 4, 2]

You can then remove some of the numbers from the sudoku solution to create the puzzle:

squares = side*side
empties = squares * 3//4
for p in sample(range(squares),empties):
    board[p//side][p%side] = 0

numSize = len(str(side))
for line in board:
    print(*(f"{n or '.':{numSize}} " for n in line))

6  .  .  .  .  3  .  .  1
.  9  .  .  .  .  .  .  3
4  .  3  .  .  .  6  .  .
.  .  .  5  9  .  2  .  6
.  .  .  .  .  .  .  .  .
.  .  7  .  .  .  .  .  4
.  .  .  .  .  .  1  7  .
.  .  2  .  .  8  .  .  .
.  .  8  .  .  .  .  4  2

For 4x4 up to 36x36 puzzles, you could make a nicer print of the board like this:

def expandLine(line):
    return line[0]+line[5:9].join([line[1:5]*(base-1)]*base)+line[9:13]
line0  = expandLine("╔═══╤═══╦═══╗")
line1  = expandLine("║ . │ . ║ . ║")
line2  = expandLine("╟───┼───╫───╢")
line3  = expandLine("╠═══╪═══╬═══╣")
line4  = expandLine("╚═══╧═══╩═══╝")

symbol = " 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ"
nums   = [ [""]+[symbol[n] for n in row] for row in board ]
print(line0)
for r in range(1,side+1):
    print( "".join(n+s for n,s in zip(nums[r-1],line1.split("."))) )
    print([line2,line3,line4][(r%side==0)+(r%base==0)])

╔═══╤═══╤═══╦═══╤═══╤═══╦═══╤═══╤═══╗
║ 6 │   │   ║   │   │ 3 ║   │   │ 1 ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║   │ 9 │   ║   │   │   ║   │   │ 3 ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║ 4 │   │ 3 ║   │   │   ║ 6 │   │   ║
╠═══╪═══╪═══╬═══╪═══╪═══╬═══╪═══╪═══╣
║   │   │   ║ 5 │ 9 │   ║ 2 │   │ 6 ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║   │   │   ║   │   │   ║   │   │   ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║   │   │ 7 ║   │   │   ║   │   │ 4 ║
╠═══╪═══╪═══╬═══╪═══╪═══╬═══╪═══╪═══╣
║   │   │   ║   │   │   ║ 1 │ 7 │   ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║   │   │ 2 ║   │   │ 8 ║   │   │   ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║   │   │ 8 ║   │   │   ║   │ 4 │ 2 ║
╚═══╧═══╧═══╩═══╧═══╧═══╩═══╧═══╧═══╝

[EDIT]

Here are some additional information on the shuffling process ...

Shuffling rows is broken down in groups of 3 rows. It is ok to swap groups as a whole but we can't swap rows across groups without breaking the integrity of the blocks. (the same reasoning applies to columns)

For example:

0 [6, 2, 5,  8, 4, 3,  7, 9, 1] \                 -|
1 [7, 9, 1,  2, 6, 5,  4, 8, 3] |  group 0 -|     -| r in shuffle(rBase) 
2 [4, 8, 3,  9, 7, 1,  6, 2, 5] /           |     -|
                                            |
3 [8, 1, 4,  5, 9, 7,  2, 3, 6] \           |     -|
4 [2, 3, 6,  1, 8, 4,  9, 5, 7] |  group 1 -| *   -| r in shuffle(rBase)
5 [9, 5, 7,  3, 2, 6,  8, 1, 4] /           |     -|
                                            |
6 [5, 6, 9,  4, 3, 2,  1, 7, 8] \           |     -|
7 [3, 4, 2,  7, 1, 8,  5, 6, 9] |  group 2 -|     -| r in shuffle(rBase)
8 [1, 7, 8,  6, 5, 9,  3, 4, 2] /                 -|

                                * for g in shuffle(rBase)

We can swap groups 0,1,2 by moving all 3 of their rows at the same time:

3 [8, 1, 4,  5, 9, 7,  2, 3, 6] \           |     -|
4 [2, 3, 6,  1, 8, 4,  9, 5, 7] |  group 1 -|     -| r in shuffle(rBase)
5 [9, 5, 7,  3, 2, 6,  8, 1, 4] /           |     -|
                                            |
6 [5, 6, 9,  4, 3, 2,  1, 7, 8] \           |     -|
7 [3, 4, 2,  7, 1, 8,  5, 6, 9] |  group 2 -| *   -| r in shuffle(rBase)
8 [1, 7, 8,  6, 5, 9,  3, 4, 2] /                 -|
                                            |
0 [6, 2, 5,  8, 4, 3,  7, 9, 1] \           |     -|
1 [7, 9, 1,  2, 6, 5,  4, 8, 3] |  group 0 -|     -| r in shuffle(rBase) 
2 [4, 8, 3,  9, 7, 1,  6, 2, 5] /           |     -|

                                * for g in shuffle(rBase)

And we can swap between the 3 rows of a group (e.g. 3,4,5) ...

0 [6, 2, 5,  8, 4, 3,  7, 9, 1] \                 -|
1 [7, 9, 1,  2, 6, 5,  4, 8, 3] |  group 0 -|     -| r in shuffle(rBase) 
2 [4, 8, 3,  9, 7, 1,  6, 2, 5] /           |     -|
                                            |
5 [9, 5, 7,  3, 2, 6,  8, 1, 4] \           |     -|
4 [2, 3, 6,  1, 8, 4,  9, 5, 7] |  group 1 -| *   -| r in shuffle(rBase)
3 [8, 1, 4,  5, 9, 7,  2, 3, 6] /           |     -|
                                            |
6 [5, 6, 9,  4, 3, 2,  1, 7, 8] \           |     -|
7 [3, 4, 2,  7, 1, 8,  5, 6, 9] |  group 2 -|     -| r in shuffle(rBase)
8 [1, 7, 8,  6, 5, 9,  3, 4, 2] /                 -|

                                * for g in shuffle(rBase)

We CANNOT swap rows across groups (e.g. 1 <--> 3):

0 [6, 2, 5,  8, 4, 3,  7, 9, 1] \                 -|
3 [8, 1, 4,  5, 9, 7,  2, 3, 6] |  group 0 -|     -| r in shuffle(rBase) 
2 [4, 8, 3,  9, 7, 1,  6, 2, 5] /           |     -|
                                            |
1 [7, 9, 1,  2, 6, 5,  4, 8, 3] \           |     -|
4 [2, 3, 6,  1, 8, 4,  9, 5, 7] |  group 1 -| *   -| r in shuffle(rBase)
5 [9, 5, 7,  3, 2, 6,  8, 1, 4] /           |     -|
                                            |
6 [5, 6, 9,  4, 3, 2,  1, 7, 8] \           |     -|
7 [3, 4, 2,  7, 1, 8,  5, 6, 9] |  group 2 -|     -| r in shuffle(rBase)
8 [1, 7, 8,  6, 5, 9,  3, 4, 2] /                 -|

                                * for g in shuffle(rBase)

See the duplicate 8 in the top left block, duplicate 7 below that, etc.

Single solution puzzle

In order to generate a sudoku puzzle with only one solution you will need a solver function that can tell you if there are more than one solution. The strategy I would suggest is to start with 75% (or more) of the numbers removed, then check that there is only one solution. If there is more than one solution, put back a number and check again. You can put back a number at a random position or select a position where the solutions differ (which will converge faster to a single solution puzzle)

First write a solver that will generate all solutions that it finds (ideally as a generator because we only need the first 2). Here's a simple one:

def shortSudokuSolve(board):
    side   = len(board)
    base   = int(side**0.5)
    board  = [n for row in board for n in row ]
    blanks = [i for i,n in enumerate(board) if n==0 ]
    cover  = { (n,p):{*zip([2*side+r, side+c, r//base*base+c//base],[n]*(n and 3))}
                for p in range(side*side) for r,c in [divmod(p,side)] for n in range(side+1) }
    used   = set().union(*(cover[n,p] for p,n in enumerate(board) if n))
    placed = 0
    while placed>=0 and placed<len(blanks):
        pos        = blanks[placed]
        used      -= cover[board[pos],pos]
        board[pos] = next((n for n in range(board[pos]+1,side+1) if not cover[n,pos]&used),0)
        used      |= cover[board[pos],pos]
        placed    += 1 if board[pos] else -1
        if placed == len(blanks):
            solution = [board[r:r+side] for r in range(0,side*side,side)]
            yield solution
            placed -= 1

Starting with a solution variable with all numbers present and board variable containing the puzzle with 3/4 of numbers cleared, you can add numbers back to the board until there is only one way to solve it:

solution=[[9, 5, 3, 1, 6, 7, 4, 2, 8],
          [4, 2, 8, 3, 5, 9, 7, 6, 1],
          [7, 6, 1, 8, 2, 4, 9, 5, 3],
          [5, 8, 4, 9, 3, 6, 2, 1, 7],
          [6, 3, 9, 7, 1, 2, 5, 8, 4],
          [2, 1, 7, 4, 8, 5, 6, 3, 9],
          [3, 4, 5, 6, 9, 1, 8, 7, 2],
          [8, 7, 2, 5, 4, 3, 1, 9, 6],
          [1, 9, 6, 2, 7, 8, 3, 4, 5]]    
board=[ [0, 0, 0, 0, 0, 0, 0, 0, 8],
        [0, 2, 0, 0, 5, 0, 7, 6, 0],
        [0, 6, 0, 0, 0, 0, 0, 0, 3],
        [5, 0, 0, 0, 0, 0, 2, 0, 7],
        [0, 3, 0, 0, 1, 0, 0, 0, 0],
        [2, 0, 0, 4, 0, 0, 0, 3, 0],
        [0, 0, 0, 6, 0, 0, 0, 0, 0],
        [8, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 0, 0, 2, 7, 0, 0, 4, 0]]
    
import random
from itertools import islice
while True:
    solved  = [*islice(shortSudokuSolve(board),2)]
    if len(solved)==1:break
    diffPos = [(r,c) for r in range(9) for c in range(9)
               if solved[0][r][c] != solved[1][r][c] ] 
    r,c = random.choice(diffPos)
    board[r][c] = solution[r][c]

output:

╔═══╤═══╤═══╦═══╤═══╤═══╦═══╤═══╤═══╗
║   │   │   ║   │   │ 7 ║   │   │ 8 ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║   │ 2 │   ║   │ 5 │   ║ 7 │ 6 │   ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║   │ 6 │   ║ 8 │   │ 4 ║   │   │ 3 ║
╠═══╪═══╪═══╬═══╪═══╪═══╬═══╪═══╪═══╣
║ 5 │   │   ║   │   │   ║ 2 │   │ 7 ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║   │ 3 │   ║   │ 1 │   ║   │   │   ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║ 2 │   │   ║ 4 │   │   ║   │ 3 │   ║
╠═══╪═══╪═══╬═══╪═══╪═══╬═══╪═══╪═══╣
║   │ 4 │   ║ 6 │   │   ║   │   │   ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║ 8 │   │   ║   │   │   ║ 1 │   │ 6 ║
╟───┼───┼───╫───┼───┼───╫───┼───┼───╢
║ 1 │   │   ║ 2 │ 7 │   ║   │ 4 │   ║
╚═══╧═══╧═══╩═══╧═══╧═══╩═══╧═══╧═══╝

Note that this will work in a reasonable time for 9x9 sudoku boards but you will need a much better/faster solver function for larger boards

Also note that this will often produce "easy" puzzles as it will add in more numbers than absolutely necessary in some cases. Choosing the minimum set of numbers to add back-in would require a layered or backtracking approach which would be a bit more complex and much slower to run. Starting out with more empty spaces (e.g. 80% or more) will have a good chance of producing a more difficult puzzle within a reasonable timeframe.