algorithm concurrency erlang backtracking sudoku

Sudoku solver using a concurrent backtracking algorithm in Erlang

As my assignment I'm creating a sudoku solver in Erlang using concurrency. My initial idea was to use backtracking algorithm which would spawn new threads whenever it's making a choice.

However, after putting some time and thought into the project I'm starting to think that the way I wanted to solve this is a bit too complicated. Has anyone done something similiar in the past? Would you recommend some different algorithm that would work better with Erlang and concurrency?

Solution

BackTracking algorithm is not really adapted to use concurrency. Of course, it is always possible to spawn several processes in parallel that start with different initial conditions (all possible values of the first or 2 first cells to solve). But I don't think this is a true concurent application.

A more suitable algorithm is the propagation of constraints. The idea is to create a process per cell, each cell knowing the 20 "connected cells" processes (8 in the same column, 8 in the same line, and 4 more in the same square). A cell has a state which contain - at least - all the possible values it can take. If a cell has only one single possible value remaining, after initialisation or during the propagation of constraint, it sends a message {remove,Value} to all its connected cells to inform them to remove the value from their list.

It is a true concurrent process, but it has (at least) 2 issues: - knowing when a solution is found or when the propagation is stuck; - Only the simplest puzzles will be solved by this algorithm in one shot.

There are some other rules that could be used to solve more complex puzzles. For example look for numbers that have only one remaining possibility, looking for pairs... But these rules are not really easy to implement in parallel, and I don't know the set of rules necessary to solve any puzzle.

Note In the general case the set of rule does not exist since a puzzle may have multiple solutions, although it is not the case for the puzzles we can find in newspapers.

My idea is to complete the constraint propagation algorithm with a search algorithm. A new process, the controller, is in charge to: - initialize the puzzle - select the most promissing trial - ask to make the trial to the cell processes, - control the end of the propagation process, - check if it is - a solution -> print it - a dead end -> ask to come back to previous state, remove the initial trial number from the list, and select the next most promising trial - ask to store the current result state and continue to the next trial

So the cells have to complete their state with a stack where they can push and pop their current list of possible value.

The most promising trial can be the selected this way: find the cell with the less remaining possible values, and take the first one.

The next problem is to synchronize everything. The first "simple" solution is to use a timeout. But as always, a timeout is very difficult to define, and at the end very inefficient. I would keep a timeout only for debug purpose, so with a rather big value, because there are some risks that it doesn't work at the first attempt :o).

An alternative to the timeout is to use a counter. Each time the controller sends a message that need synchronization, it increments its counter. Each time a cell has complete the handling of a message that needs synchronization, it returns an {ack_synchro,N} message to the controller, which in turn subtracts N to its counter. Doing this, during the propagation of constraint, when a cell has only one remaining possible value, it can send an {ack_synchro,-20} to the controller before sending the {remove,Value} to its connected cells so the controller "knows" it has to wait for 20 messages more. With this principle, it is possible to synchronize the activity of the cells for the push, pop, {try,Value}, {remove,Value} messages.

I guess it is missing a lot of details, and I am not sure that it will be faster than the imperative backtracking, but it should work at a reasonable coding cost, and it is concurrent.

Edit

I coded this proposal, it is tested with only 2 test cases, one simple puzzle, one complex, and it works fine. Here is the code:

The main module (actually it runs in the shell process) to solve a puzzle use the command: sudo:start(file,Filename) or sudo:start(table,InitialList):

-module (sudo).

-export ([start/2]).

start(file,File) ->
    {ok,[Table]} = file:consult(File),
    start(table,Table);
start(table,Table) ->
    init_cells(),
    solve(Table),
    print_result(),
    stop().

stop() ->
    lists:foreach(fun(X) -> cell:stop(X) end ,cells()).

cells() ->
    [a1,a2,a3,a4,a5,a6,a7,a8,a9,
    b1,b2,b3,b4,b5,b6,b7,b8,b9,
    c1,c2,c3,c4,c5,c6,c7,c8,c9,
    d1,d2,d3,d4,d5,d6,d7,d8,d9,
    e1,e2,e3,e4,e5,e6,e7,e8,e9,
    f1,f2,f3,f4,f5,f6,f7,f8,f9,
    g1,g2,g3,g4,g5,g6,g7,g8,g9,
    h1,h2,h3,h4,h5,h6,h7,h8,h9,
    i1,i2,i3,i4,i5,i6,i7,i8,i9].

init_cells() ->
    lists:foreach(fun(X) -> cell:start_link(X) end ,cells()),
    Zip = lists:zip(cells(),lists:seq(0,80)),
    lists:foreach(fun({N,P}) -> cell:init(N,neighbors(P,Zip)) end, Zip),
    wait(81).

neighbors(P,Zip) ->
    Line = fun(X) -> X div 9 end,
    Col = fun(X) -> X rem 9 end,
    Square = fun(X) -> {Line(X) div 3, Col(X) div 3} end,
    Linked = fun(X) ->  (X =/= P) andalso
                        (   (Line(X) == Line(P)) orelse
                            (Col(X) == Col(P)) orelse
                            (Square(X) == Square(P))) end,
    [Name || {Name,Pos} <- Zip, Linked(Pos)].


solve(Table) ->
    Zip = lists:zip(cells(),Table),
    test(Zip),
    do_solve(is_solved()).

do_solve({true,_,_,_}) ->
    done;
do_solve({false,Name,Value,_}) ->
    push(),
    test(Name,Value),
    do_solve(is_solved());
do_solve(error) ->
    pop(),
    {false,Name,Value,_}  = is_solved(),
    remove(Name,Value),
    do_solve(is_solved()).

print_result() ->
    R = get_cells(),
    F = fun({_,[I]},Acc) ->
            case Acc of
                _ when (Acc rem 27) == 0 -> io:format("~n~n ~p",[I]);
                _ when (Acc rem 9) == 0 -> io:format("~n ~p",[I]);
                _ when (Acc rem 3) == 0 -> io:format("  ~p",[I]);
                _ -> io:format(" ~p",[I])
            end,
            Acc+1
        end,
    lists:foldl(F,0,R),
    io:format("~n").

test(List) ->
    F = fun({_,0},Acc) ->
                Acc;
           ({Name,Value},Acc) ->
                cell:test(Name,Value),
                Acc+1
        end,
    NbMessages = lists:foldl(F,0,List),
    wait(NbMessages).

test(_,0) -> ok;
test(Name,Value) ->
    cell:test(Name,Value),
    wait(1).

remove(Name,Value) ->
    cell:remove(Name,Value),
    wait(1).

push() ->
    lists:foreach(fun(X) -> cell:push(X) end, cells()),
    wait(81).

pop() ->
    lists:foreach(fun(X) -> cell:pop(X) end, cells()),
    wait(81).

wait(0) ->
    done;
wait(NbMessages) ->
    receive
        {done,N} -> wait(NbMessages-N);
        {add,N}  -> wait(NbMessages+N)
    after 2000 ->
        error
    end.

get_cells() ->
    F = fun(X) -> cell:get_val(X), receive {possible,M} -> M end, {X,M} end,
    [F(X) || X <- cells()].

is_solved() ->
    State = get_cells(),
    F = fun({_,[]},_) -> error;
           (_,error) -> error;
           ({Name,List},Acc = {_,_CurName,_CurVal,Length}) ->
               NL = length(List),
               case (NL > 1) andalso( NL < Length) of
                    true -> {false,Name,hd(List),NL};
                    false -> Acc
           end
        end,
    lists:foldl(F,{true,none,0,10},State).

The Cell server and its interfaces

-module (cell).
-export ([start_link/1,init/2,push/1,pop/1,test/2,remove/2,stop/1,get_val/1]).

% Interfaces

start_link(Name) ->
    Pid = spawn_link(fun() -> init() end),
    register(Name,Pid).

init(Name,List) ->
    Name ! {init,self(),List}.

push(Name) ->
    Name ! push.

pop(Name) ->
    Name ! pop.

test(Name,Value) ->
    Name ! {test,Value}.

remove(Name,Value) ->
    Name ! {remove,Value}.

get_val(Name) ->
    Name ! get.

stop(Name) ->
    Name ! stop.

% private

init() ->
    loop(none,[],[],[]).

loop(Report,Possible,Stack,Neighbors) ->
    receive
       {init,R,List} ->
           R ! {done,1},
           loop(R,lists:seq(1,9),[],List);
       push ->
           Report ! {done,1},
           loop(Report,Possible,[Possible|Stack],Neighbors);
       pop ->
           Report ! {done,1},
           loop(Report,hd(Stack),tl(Stack),Neighbors);
       {test,Value} ->
            NewP = test(Report,Possible,Neighbors,Value),
            loop(Report,NewP,Stack,Neighbors);
       {remove,Value} ->
            NewP = remove(Report,Possible,Neighbors,Value),
            loop(Report,NewP,Stack,Neighbors);
       get ->
            Report ! {possible,Possible},
            loop(Report,Possible,Stack,Neighbors);
       stop ->
            ok
    end.

test(Report,Possible,Neighbors,Value) ->
    true = lists:member(Value,Possible),
    Report ! {add,20},
    lists:foreach(fun(X) -> remove(X,Value) end, Neighbors),
    Report ! {done,1},
    [Value].

remove(Report,Possible,Neighbors,Value) ->
    case Possible of
        [Value,B] ->
            remove(Report,B,Neighbors);
        [A,Value] ->
            remove(Report,A,Neighbors);
        _ ->
            Report ! {done,1}
    end,
    lists:delete(Value,Possible).

remove(Report,Value,Neighbors) ->
    Report ! {add,20},
    lists:foreach(fun(X) -> remove(X,Value) end, Neighbors),
    Report ! {done,1}.

a test file:

[
0,0,0,4,0,6,9,0,0,
0,0,0,0,0,0,1,0,0,
0,0,0,3,0,0,0,7,2,
0,0,5,6,4,0,0,0,0,
0,2,3,0,8,0,0,0,1,
0,8,0,0,0,2,4,0,5,
0,7,8,0,0,0,5,0,0,
6,0,1,0,0,7,2,0,0,
0,0,2,0,0,9,0,0,0
].

in action:

1> c(sudo).
{ok,sudo}
2> c(cell).
{ok,cell}
3> timer:tc(sudo,start,[file,"test_hard.txt"]).


 1 3 7  4 2 6  9 5 8
 2 6 9  7 5 8  1 4 3
 8 5 4  3 9 1  6 7 2

 7 1 5  6 4 3  8 2 9
 4 2 3  9 8 5  7 6 1
 9 8 6  1 7 2  4 3 5

 3 7 8  2 1 4  5 9 6
 6 9 1  5 3 7  2 8 4
 5 4 2  8 6 9  3 1 7
{16000,ok}
4>

No comments in the code, but it does exactly what I propose in the first part of the answer.