Tags: c, shared-memory, compiler-optimization, volatile, strncmp

Why does the compiler optimize away shared memory reads due to strncmp() even if volatile keyword is used?


Here is a program foo.c that writes data to shared memory.

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main()
{
    key_t key;
    int shmid;
    char *mem;

    if ((key = ftok("ftok", 0)) == -1) {
        perror("ftok");
        return 1;
    }

    if ((shmid = shmget(key, 100, 0600 | IPC_CREAT)) == -1) {
        perror("shmget");
        return 1;
    }

    printf("key: 0x%x; shmid: %d\n", key, shmid);

    if ((mem = shmat(shmid, NULL, 0)) == (void *) -1) {
        perror("shmat");
        return 1;
    }

    sprintf(mem, "hello");
    sleep(10);
    sprintf(mem, "exit");

    return 0;
}

Here is another program bar.c that reads data from the same shared memory.

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main()
{
    key_t key;
    int shmid;
    volatile char *mem;

    if ((key = ftok("ftok", 0)) == -1) {
        perror("ftok");
        return 1;
    }

    if ((shmid = shmget(key, sizeof (int), 0400 | IPC_CREAT)) == -1) {
        perror("shmget");
        return 1;
    }

    printf("key: 0x%x; shmid: %d\n", key, shmid);

    if ((mem = shmat(shmid, NULL, 0)) == (void *) -1) {
        perror("shmat");
        return 1;
    }

    printf("looping ...\n");
    while (strncmp((char *) mem, "exit", 4) != 0)
        ;

    printf("exiting ...\n");

    return 0;
}

I run the writer program first in one terminal.

touch ftok && gcc foo.c -o foo && ./foo

While the writer program is still running, I run the reader program in another terminal.

gcc -O1 bar.c -o bar && ./bar

The reader program goes into an infinite loop. It looks like the optimizer has optimized the following code

    while (strncmp((char *) mem, "exit", 4) != 0)
        ;

to

    while (1)
        ;

because it sees nothing in the loop that could modify the data at mem after it has been read once.

But I declared mem as volatile precisely for this reason: to prevent the compiler from optimizing the reads away.

volatile char *mem;

Why does the compiler still optimize away the reads for mem?

By the way, I have found a workaround. It is to change

    while (strncmp((char *) mem, "exit", 4) != 0)
        ;

to

    while (mem[0] != 'e' || mem[1] != 'x' || mem[2] != 'i' || mem[3] != 't')
        ;

Why does the compiler optimize away strncmp((char *) mem, "exit", 4) != 0 but not mem[0] != 'e' || mem[1] != 'x' || mem[2] != 'i' || mem[3] != 't', even though mem is declared as volatile char * in both cases?


Solution

  • By writing (char *) mem you cast away the volatile qualifier, telling the strncmp function that the buffer is not actually volatile. And indeed, strncmp and the other C library functions are not designed to work on volatile buffers.

    You do in fact need to modify your code so that it does not call C library functions on volatile buffers. Your options include:

    • Write your own alternative to the C library function that works with volatile buffers.
    • Use a proper memory barrier.
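
A minimal sketch of the first option, assuming a hand-written helper is acceptable. The name volatile_strncmp is hypothetical and not part of the C library. Because the parameter keeps its volatile qualifier, every byte of the buffer is read through a volatile lvalue and the compiler may not hoist the reads out of a polling loop.

```c
#include <stddef.h>

/* Hypothetical helper (not a C library function): compares up to n bytes
 * of a volatile buffer against an ordinary string. Each access to buf[i]
 * is a volatile read, so it is performed on every call. */
int volatile_strncmp(const volatile char *buf, const char *s, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned char a = (unsigned char) buf[i];  /* volatile read */
        unsigned char b = (unsigned char) s[i];

        if (a != b)
            return a < b ? -1 : 1;
        if (a == '\0')          /* both strings ended before n bytes */
            return 0;
    }
    return 0;
}
```

With this helper the reader loop could be written as while (volatile_strncmp(mem, "exit", 4) != 0) ; with no cast. Note that the bytes are still read one at a time, so the comparison as a whole is not atomic.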

    You've gone with the first option, but think about what would happen if the other process modified the memory in between your four reads: you could see a mix of old and new bytes. To avoid this sort of problem you'd need to use the second option, an inter-process memory barrier, in which case the buffer no longer needs to be volatile and you can go back to using the C library functions. (The compiler must assume that the barrier check might change the buffer.)
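
One way the second option might look is sketched below, using C11 atomics as the barrier. Everything here is an assumption for illustration: the struct shared_msg layout and the msg_publish and msg_matches names are hypothetical, and the sketch assumes atomic_int is lock-free on the platform so that it works across processes in a shared segment. It only covers a single hand-off (the writer publishes one message); repeated updates would need a fuller protocol such as a semaphore.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout for the shared segment: a flag plus the payload.
 * A newly created System V segment is zero-filled, so ready starts at 0. */
struct shared_msg {
    atomic_int ready;   /* set to 1 by the writer after text is complete */
    char text[96];      /* ordinary, non-volatile payload */
};

/* Writer side: fill in the payload, then publish it with a release store
 * so the payload writes become visible before the flag does. */
void msg_publish(struct shared_msg *m, const char *s)
{
    strncpy(m->text, s, sizeof m->text - 1);
    m->text[sizeof m->text - 1] = '\0';
    atomic_store_explicit(&m->ready, 1, memory_order_release);
}

/* Reader side: returns 1 if a message has been published and its first
 * n bytes equal expected, 0 otherwise. The acquire load is the barrier;
 * after it, plain strncmp on the payload is fine. */
int msg_matches(struct shared_msg *m, const char *expected, size_t n)
{
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return 0;       /* nothing published yet */
    return strncmp(m->text, expected, n) == 0;
}
```

The reader would then poll with while (!msg_matches(m, "exit", 4)) ; where m points at the attached segment. The atomic load cannot be hoisted out of the loop, and no volatile qualifier or cast is involved.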