Search code examples
timeoutdeadlocksendpromela

How to send and receive in a PROMELA process without timeout / deadlock?


I cannot wrap my head around this PROMELA problem: I have N processes ("pc") which may both send and receive messages over a channel ("to_pc"). Each process has its own channel over which it receives messages. For a process to be able to receive, I have to keep it in a loop which checks the channel for incoming messages. As a second loop option the process sends a message to all other channels.

However, in simulation mode, this always causes a timeout, without anything being sent at all. My theory so far is that I created a deadlock where all processes want to send at once, causing them all to be unable to receive (since they are stuck in their "send" part of the code).

So far I have been unable to resolve this problem. I have tried to use a global variable as a semaphore to "forbid" sending, so that only one channel may send. However, this did not change the results. My only other idea is to use a timeout as the trigger for the sending, but this does not seem right to me at all.

Any ideas? Thanks in advance!

#define N 4

mtype={request,reply}

typedef message {
    mtype type;
    byte target;
    byte sender;
};

chan to_pc[N] = [0] of {message}

inline send() {
    byte j = 0;
    for (j : 0 .. N-1) {
        if
        :: j != address ->
            to_pc[j]!msg;
        :: else;
        fi
    }
}

active [N] proctype pc(){
    byte address = _pid;
    message msg;

    do
    :: to_pc[address]?msg -> /* Here I am receiving a message. */
        if
        ::msg.type == request->
            if
            :: msg.target == address ->
                d_step {
                    msg.target = msg.sender
                    msg.sender = address;
                    msg.type = reply;
                }
                send();
            :: else
            fi
        :: msg.type == reply;
        :: else;
        fi
    :: /* Here I want to send a message! */
        d_step {
            msg.target = (address + 1) % N;
            msg.sender = address;
            msg.type = request;
        }
        send();
    od
};

Solution

  • I can write a full-fledged working version of your source code if you want, but perhaps it is sufficient to highlight the source of the issue you are dealing with and let you have fun solving it.


    Branching Rules

    • any branch with an executable condition can be taken, non-deterministically
    • if there is no branch with an executable condition, the else branch is taken
    • if there is no branch with an executable condition and no else branch, then the process hangs till when one of the conditions becomes true

    Consider this

    1: if
    2:    :: in?stuff -> ...
    3:    :: out!stuff -> ...
    4: fi
    

    where in and out are both synchronous channels (size is 0).

    Then

    • if someone is sending on the other end of in then in?stuff is executable, otherwise it is not
    • if someone is receiving on the other end of out then out!stuff is executable, otherwise it is not
    • the process blocks at line 1: up until when at least one of the two conditions is executable.

    Compare that code to this

    1: if
    2:    :: true -> in?stuff; ...
    3:    :: true -> out!stuff; ...
    4: fi
    

    where in and out are again synchronous channels (size is 0).

    Then

    • both branches have an executable condition (true)
    • the process immediately commits itself to either send or receive something, by non-deterministically choosing to execute a branch either at line 2: or 3:
    • if the process chooses 2: then it blocks if in?stuff is not executable, even when out!stuff would be executable
    • if the process chooses 3: then it blocks if out!stuff is not executable, even when in!stuff would be executable

    Your code falls in the latter situation, since all the instructions within d_step { } are executable and your process commits to send way too early.


    To sum up: in order to fix your model, you should refactor your code so that it's always possible to jump from send to receive mode and viceversa. Hint: get rid of that inline code, separate the decision to send from actual sending.