thread-safety pthreads fork fifo multiple-processes

Fifo timing with multiple threads. Two writes to fifo from one thread only work with sleep(1)

I am making an application which forks in the beginning into two processes. Simplified, one of the processes continuously reads from a fifo file, and the other process occasionally writes notifications to the fifo file. Everything works fine unless the writing process calls its writing method several times in quick succession. Then only the first notification gets written to (or read from?) the fifo file.

This is the code of the fifo reading process:

void logging(){
    int fd1;

    char * myfifo = "/home/jens/Desktop/CLion_projects/Labo9/logfifo";
    mkfifo(myfifo, 0666);
    char str1[20];
    while(1) {
        pthread_mutex_lock(&lock_fifo);
        fd1 = open(myfifo,O_RDONLY);                            
        read(fd1, str1, 1000);

        printf("%s\n", str1);
        close(fd1);
        pthread_mutex_unlock(&lock_fifo);
    }
}

This is the method which writes to the fifo file (in the other process):

void write_to_fifo(char * string_to_write){
    int fd;
    char * myfifo = "/home/jens/Desktop/CLion_projects/Labo9/logfifo";
//    mkfifo(myfifo, 0666);                             (should this be on or not?)

    char * arr2 = string_to_write;
    fd = open(myfifo, O_WRONLY);
    write(fd, arr2, strlen(arr2)+1);
    close(fd);
}

These are two calls that use the write_to_fifo(..) method:

    write_to_fifo("Connection to SQL server established.\n");
//    sleep(1);
    write_to_fifo("New table "TO_STRING(TABLE_NAME)" created.\n");

The first one is always printed correctly; the second one only works when sleep(1) is uncommented. Because of this, I'm guessing it has something to do with timing. Leaving sleep(1) in there wouldn't be a problem if I weren't running multiple threads at the same time. I imagine running multiple threads makes the timing unpredictable, and you cannot add a sleep(1) line between the function calls of different threads.

Why does this program only work when the delay is introduced?
Is this how it is supposed to be?
If not, how do I overcome this?

Solution

Why does this program only work when the delay is introduced?

Because FIFOs are stream-oriented, not message-oriented. It is not safe to assume that the data written by one write() call will be read as a single and complete unit by any read() call. In your particular case, if your write_to_fifo() function is called multiple times in quick succession then you may get two or more writes occurring between reads, in which case one read might get all the data from both writes, instead of just from the first.

Furthermore, if that happens then it is hidden from you because your writes (intentionally, it seems) include string terminators. However much the reader reads, then, its printf() will output data only up to the first terminator. That is, I see no reason to think that you are losing messages in the fifo; rather, I am confident that you are losing them in the reader.
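
To make that concrete, here is a minimal sketch of a reader-side helper that walks the whole buffer using the byte count from read(), rather than stopping at the first terminator the way printf("%s") does. The name print_all_messages is hypothetical and not part of your program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Print every NUL-terminated message found in the first n bytes of buf,
 * where n is the count returned by read(). Returns the number of
 * non-empty chunks printed. A trailing chunk with no terminator is
 * printed as well; it may be the start of a message whose remainder
 * has not arrived yet. */
size_t print_all_messages(const char *buf, size_t n, FILE *out)
{
    assert(buf != NULL && out != NULL);
    size_t count = 0;
    size_t pos = 0;
    while (pos < n) {
        /* Look for the next terminator, but only within the bytes
         * that were actually read. */
        const char *end = memchr(buf + pos, '\0', n - pos);
        size_t len = end ? (size_t)(end - (buf + pos)) : n - pos;
        if (len > 0) {
            fwrite(buf + pos, 1, len, out);
            count++;
        }
        pos += len + 1;   /* skip the message and its terminator */
    }
    return count;
}
```

If one read() returns the bytes of both of your notifications, printf("%s") shows only the first, whereas this helper prints both.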

Is this how it is supposed to be?

The behavior you describe is consistent with how fifos are specified to work, as described above.

If not, how do I overcome this?

Even though the behavior is "how it is supposed to be" in some sense, that does not mean you cannot obtain behavior you like better. Since it is (almost surely) the reader that is losing messages, you can fix that by making the reader smarter. You ought to be able to do that even without modifying the writer, but modifying the writer might make it easier, and there are other reasons why you might want to modify it.

In the first place, you must always consider the return values of the read() and write() functions, even more so than for most functions. Not only does the return value report on error conditions and (for read()) end-of-file conditions, but it also tells you how many bytes each call transferred. That's important information on both sides, because the number transferred by either function can be less than the number requested. Generally speaking, one must be prepared to call write() or read() in a loop to ensure that all wanted bytes are transferred. In your particular case, you don't necessarily need to read a certain number of bytes, but paying attention to how many bytes were actually read could enable you to recognize when you've obtained parts of multiple messages via the same read call.
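
The usual shape of such loops is sketched below. The helper names write_all and read_all are hypothetical, and retrying on EINTR is one reasonable policy rather than the only one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write exactly len bytes from buf to fd, continuing after short writes
 * and retrying when interrupted by a signal.
 * Returns 0 on success, -1 on error (errno is set by write()). */
int write_all(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes from fd into buf.
 * Returns 1 on success, 0 on end-of-file before any byte was read,
 * and -1 on error or on end-of-file in the middle of the data. */
int read_all(int fd, void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    char *p = buf;
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)                 /* every writer has closed its end */
            return got == 0 ? 0 : -1;
        got += (size_t)n;
    }
    return 1;
}
```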

In the second place, although null-terminated strings are an ok in-memory representation, they are not a particularly good on-the-wire representation. Since you apparently want to print each message followed by a newline, you could consider newline-terminated data instead. In that case, the reader might not even need to worry about message boundaries: if all you want it to do is dump the messages to the standard output, then it can just read from the fifo and write everything it received (using the byte count) to the output, newlines included.
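
A minimal sketch of such a pass-through reader follows, assuming the writer sends newline-terminated text with no string terminators (that is, strlen(msg) bytes rather than strlen(msg)+1). The name copy_until_eof is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Copy everything from in_fd to out_fd until end-of-file, relying only
 * on the byte counts that read() and write() report.
 * Returns the total number of bytes copied, or -1 on error. */
long copy_until_eof(int in_fd, int out_fd)
{
    assert(in_fd >= 0 && out_fd >= 0);
    char buf[4096];
    long total = 0;
    for (;;) {
        /* Never request more than the buffer can hold. */
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)               /* every writer has closed the fifo */
            return total;
        ssize_t off = 0;
        while (off < n) {         /* write() may be short as well */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

The reader would call something like copy_until_eof(fd1, STDOUT_FILENO) once after opening the fifo. Note that the read size here matches the buffer size, unlike the 1000-byte request into str1[20] in your reader, which can overrun the array.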

But if you want to handle variable-length messages in per-message units then a better protocol would help you. For example, send messages in the form of a fixed-length message length followed by that number of message bytes. That way, the reader always knows how many bytes to read (even if it has to use multiple read() calls to get them).
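
One way such a length-prefixed protocol could look is sketched here. Everything in it is an assumption made for illustration: the names send_msg, recv_msg and read_full, the 4-byte header in host byte order (adequate when both processes run on the same machine), and the 256-byte cap, chosen so that a whole frame stays below the minimum PIPE_BUF of 512 that POSIX guarantees.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define MAX_MSG 256   /* hypothetical cap on payload size */

/* Read exactly len bytes: 1 on success, 0 on clean EOF, -1 otherwise. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return got == 0 ? 0 : -1;
        got += (size_t)n;
    }
    return 1;
}

/* Send one frame: a 4-byte length followed by len payload bytes,
 * assembled first so that it goes out in a single write(). */
int send_msg(int fd, const char *msg, size_t len)
{
    assert(msg != NULL);
    unsigned char frame[sizeof(uint32_t) + MAX_MSG];
    if (len > MAX_MSG) {
        errno = EMSGSIZE;
        return -1;
    }
    uint32_t hdr = (uint32_t)len;
    memcpy(frame, &hdr, sizeof hdr);
    memcpy(frame + sizeof hdr, msg, len);
    size_t total = sizeof hdr + len;
    ssize_t n;
    do {
        n = write(fd, frame, total);
    } while (n < 0 && errno == EINTR);
    return n == (ssize_t)total ? 0 : -1;
}

/* Receive one frame into out (capacity cap) and NUL-terminate it.
 * Returns 1 and sets *len_out on success, 0 on EOF, -1 on error. */
int recv_msg(int fd, char *out, size_t cap, size_t *len_out)
{
    uint32_t hdr;
    int r = read_full(fd, &hdr, sizeof hdr);
    if (r <= 0)
        return r;
    if (hdr > MAX_MSG || hdr >= cap)   /* keep room for the terminator */
        return -1;
    if (read_full(fd, out, hdr) != 1)
        return -1;
    out[hdr] = '\0';
    *len_out = hdr;
    return 1;
}
```

With this framing the reader gets one complete message per recv_msg() call, however the bytes happened to be grouped by the underlying read() calls.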

In the third place, readers and writers should not keep opening and closing the fifo. Each process should open it once, and hold it open as long as needed. Either or both may create the fifo, once, though if you have both of them do it then you need to be prepared for at least one to fail to do so (on account of the other having done it first, in which case mkfifo() fails with EEXIST). Multiple threads of the writer can share the same file descriptor, and in fact that is often preferable to multiple threads all opening the same file separately. There is no need to flush after each message as a substitute for closing the file: write() hands the data straight to the fifo, and fsync() does not apply to fifos (on Linux it fails with EINVAL).
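
A sketch of the open-once arrangement on the writer side follows. The names ensure_fifo, log_open, log_write, log_close and log_fd are hypothetical, and log_write assumes short messages for which a single write() suffices.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the fifo unless it already exists. Safe to call from both
 * processes: whichever comes second sees EEXIST, which is not treated
 * as a failure. (EEXIST is also reported when the path names something
 * other than a fifo; this sketch does not check for that.) */
int ensure_fifo(const char *path)
{
    assert(path != NULL);
    if (mkfifo(path, 0666) == 0 || errno == EEXIST)
        return 0;
    return -1;
}

static int log_fd = -1;   /* one descriptor shared by all writer threads */

/* Call once, before the writer threads start.
 * open() blocks until some process has the fifo open for reading. */
int log_open(const char *path)
{
    if (ensure_fifo(path) != 0)
        return -1;
    log_fd = open(path, O_WRONLY);
    return log_fd < 0 ? -1 : 0;
}

/* Write one message through the shared descriptor. */
int log_write(const char *msg, size_t len)
{
    ssize_t n = write(log_fd, msg, len);
    return n == (ssize_t)len ? 0 : -1;
}

/* Call once, when the writer process is done logging. */
void log_close(void)
{
    if (log_fd >= 0) {
        close(log_fd);
        log_fd = -1;
    }
}
```

In your program, write_to_fifo() would then reduce to a call such as log_write(string_to_write, strlen(string_to_write)), with log_open() called once after the fork.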

In the multithreaded case, using the same FD for all threads, you do not need to worry about the data from one write call by one thread being interleaved with the data from a write call by a different thread, provided each write is no larger than PIPE_BUF bytes (POSIX guarantees at least 512; on Linux it is 4096). However, you do need to be aware that if an outbound message ends up being split over more than one write call then you might sometimes get another thread's write interposed between. You can use a mutex to ensure that messages are not split up that way when they get spread over multiple writes.
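
A minimal sketch of that locking, assuming all writer threads share one descriptor and go through the same function. The names write_lock and write_message_locked are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/* Serializes whole messages, not individual write() calls. */
static pthread_mutex_t write_lock = PTHREAD_MUTEX_INITIALIZER;

/* Write one complete message to fd. The mutex is held across the whole
 * loop, so if the message needs several write() calls, no other thread
 * using this function can slip its own data in between them.
 * Returns 0 on success, -1 on error. */
int write_message_locked(int fd, const char *msg, size_t len)
{
    assert(msg != NULL || len == 0);
    int rc = 0;
    pthread_mutex_lock(&write_lock);
    while (len > 0) {
        ssize_t n = write(fd, msg, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            rc = -1;
            break;
        }
        msg += n;
        len -= (size_t)n;
    }
    pthread_mutex_unlock(&write_lock);
    return rc;
}
```

The lock only helps if every thread that writes to the fifo uses this function; a thread calling write() directly bypasses it.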

If there is only one reader, however, then it's unclear to me what is gained by using a mutex on that side.