I have some code here that solves the n queens problem using backtracking in python. When I run it, the odds always take much less time than the evens. This is especially evident when the n gets to around 20+. Does anyone know why this is?
import time
global N
N = int(input("Enter N: "))
def display_solution(board):
print('\n'.join(['\t'.join([str(cell) for cell in row]) for row in
board]))
def safe(board, row, col):
for i in range(col):
if board[row][i] == 1:
return False
for i, j in zip(range(row, -1, -1), range(col, -1, -1)):
if board[i][j] == 1:
return False
for i, j in zip(range(row, N, 1), range(col, -1, -1)):
if board[i][j] == 1:
return False
return True
def solve_help(board, col):
if col >= N:
return True
for i in range(N):
if safe(board, i, col):
board[i][col] = 1
if solve_help(board, col + 1) is True:
return True
board[i][col] = 0
return False
def solve():
board = [[0 for x in range(N)] for y in range(N)]
if solve_help(board, 0) is False:
print("Solution does not exist")
return False
display_solution(board)
return True
start = time.clock()
solve()
end = time.clock()
print(N, "\t", end-start)
I'm assuming it must have something to do with the diagonals being different for odds as opposed to evens. I'm also not sure if this is an issue with all backtracking algorithms for this problem, or just this specific code. Either way, thanks for the help.
The algorithm takes considerable more time when in one of the first columns backtracking occurs and a next row must be tried. And comparing odd-N boards with N-1 boards shows indeed that often the solution for the even board needs to do more such backtracking/try-next processing. For example the top-left corner of the solution for N=19 looks like this:
1 0 0 0 0
0 0 0 1 0
0 1 0 0 0
0 0 0 0 1*
0 0 1 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
These 5 queens in the first five columns are found quickly, as they are the first that do not collide with the previous queens. And apparently the other queens can be placed without having to reconsider these first five queens.
For N=18 that same corner of the solution looks like this:
1 0 0 0 0
0 0 0 1 0
0 1 0 0 0
0 0 0 0 0-
0 0 1 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 1*
Note the position marked with a minus. This one looks promising (as it was for the 19-board): its investigation takes some considerable time before the conclusion is drawn that the other queens cannot be placed correctly. This early failure costs.
Thus the solution for the 19 board is found sooner than for the 18 board.
Note that the solution for 27 takes slightly more time than the one for 26, although this is not significant: it looks like the time complexity is O(2n), and so to compare times of different board sizes it would better be done on a logarithmic Y-axis:
"work" represents the number of times the function safe
is executed.
Whether this algorithm always takes relatively more time for even boards (compared to the time needed for a N+1 board) is unclear, but for these few board sizes it seems related to the knight-jumps that are naturally formed by this algorithm, starting from the top-left corner. Note how this pattern works out perfectly for board sizes 5 and 7: the first spot where the next queen can sit without interfering with the previous positioned queens is always part of the solution. While for board sizes 4 and 6 there isn't even any solution with a queen in a corner, which is the starting point of this algorithm.
To take this question from a programmer's point of view, there is one remedy whereby the difference will (on average) evaporate: pick the columns in a different (or even random) order. It turns out that taking the normal order is one of the less efficient ways to the get a solution.
A simple shift in this algorithm, where you do not consider the first two rows unless all other fail, already changes the stats considerably:
In solve_help
change this:
for i in range(N):
to:
for i in range(N):
i = (i + 2) % N
See how now the average performance has improved: all measures of log(work)/n are below 1, except for n=6. But also: the peeks are now more often for odd values of N.
for i in random.sample(range(N), N):
Here is how one such random run turned out:
Much better stats than the original order! Of course, you will get a bad stat now and then, ... because it is random. But on average it performs (much) better.
Other ways of non-random order could include col
, and N//2
with differing coefficients, like i = (i + col*2 + N//2) % N
, ...etc. But see final remark below.
I would apply the following optimisation: keep track of which rows, forward "diagonals" and backward "diagonals" are already taken. You can use list(s) or set(s) for that. Note that two cells are in the same forward diagonal if the sum of their coordinates are equal. Cells on the backward diagonals have in common that the difference of their coordinates is equal. That way you don't have to scan for a "1" in these lines each time.
Also, board
could be just a list of column numbers. There is no need to store all those zeroes. Keep that for the display only.
Finally, there are simple ways to get a solution in linear time. See Wikipedia.