Tags: c, concurrency, locking, fork, shared-memory

Lock memory shared between forked processes in C


Building on top of this answer, I would like to make a program that forks itself to distribute a large number of independent computations over the CPUs of a machine. The computations output a number in the set {0, ..., n - 1}. My goal is simply to count the number of occurrences of each output in an array of size n.

To be clear, here is what the main loop in each of the child processes would look like:

for (i = start_range_for_this_process; i < end_range_for_this_process; i++)
{
    input = generate_input(i);
    output = the_computation(input);
    count[output]++;
}

My question is the following: Is the answer I linked to above safe if many processes try to increment the same entry in the count array at the same time? If not, is there a way to make the processes use the array one at a time by asking them to lock it when it is free and wait when it is not free?

I tried searching on the web, but I didn't find what I was looking for. I might not be using the right keywords, since I am not very experienced with concurrent programming. Thanks for any help!


Solution

  • Is the answer I linked to above safe if many processes try to increment the same entry in the count array at the same time?

    No, it's not safe. At the very least you can get incorrect results, where some increments may get "dropped" (because another process overwrites the value between when it is read and when the incremented value is written back). See Can num++ be atomic for 'int num'?
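
    To see why, note that count[output]++ is really three separate steps, and steps from different processes can interleave. Very roughly:

    /* count[output]++ behaves like: */
    int tmp = count[output];    /* 1. read the current value     */
    tmp = tmp + 1;              /* 2. add one                    */
    count[output] = tmp;        /* 3. write the result back      */
    /* If two processes both do step 1 before either does step 3,
       one of the two increments is lost. */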

  • If not, is there a way to make the processes use the array one at a time by asking them to lock it when it is free and wait when it is not free?

    Well, you could have a semaphore for the array, or one for each array element, which each process decrements before accessing the array and increments afterwards; you could even use the semaphore itself as the counter. (See Lock, mutex, semaphore... what's the difference? for more general info; a minimal sketch of the semaphore approach follows the atomic example below.) But this is expensive. A more efficient way is to use the atomics introduced in C11 (<stdatomic.h>) to do the increments on your array:

    #include <stdatomic.h>
    ...
    /* count must point into memory shared with the other processes,
       e.g. a MAP_SHARED mapping created before fork(). */
    atomic_int *count = attach_shared_memory();
    for (i = start_range_for_this_process; i < end_range_for_this_process; i++)
    {
        input = generate_input(i);
        output = the_computation(input);
        /* Atomic read-modify-write: no increment can be lost, and relaxed
           ordering is enough for independent counters. */
        atomic_fetch_add_explicit(&count[output], 1, memory_order_relaxed);
    }
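
    For completeness, here is a rough, self-contained sketch of the semaphore approach mentioned above. The worker count, output range, and stand-in computation are made up for illustration, and it assumes Linux-style process-shared POSIX semaphores and the widely supported (but non-standard) MAP_ANONYMOUS mmap flag; compile with e.g. cc -pthread.

    #include <semaphore.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define N_OUTPUTS 16    /* n: number of distinct output values (example) */
    #define N_WORKERS 4     /* number of forked workers (example)            */

    int main(void)
    {
        /* Put both the counters and the semaphore in anonymous shared
           memory so every child sees the same copies after fork(). */
        int *count = mmap(NULL, N_OUTPUTS * sizeof *count,
                          PROT_READ | PROT_WRITE,
                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        sem_t *lock = mmap(NULL, sizeof *lock,
                           PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (count == MAP_FAILED || lock == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        sem_init(lock, 1, 1);           /* pshared = 1: works across fork() */

        for (int w = 0; w < N_WORKERS; w++) {
            if (fork() == 0) {
                for (int i = w * 1000; i < (w + 1) * 1000; i++) {
                    int output = i % N_OUTPUTS;   /* stand-in computation   */
                    sem_wait(lock);               /* wait until array is free */
                    count[output]++;
                    sem_post(lock);               /* let other processes in */
                }
                _exit(0);
            }
        }
        while (wait(NULL) > 0)          /* wait for all workers */
            ;
        for (int i = 0; i < N_OUTPUTS; i++)
            printf("count[%d] = %d\n", i, count[i]);
        sem_destroy(lock);
        return 0;
    }

    Every increment now pays for a lock/unlock, possibly including a system call under contention, which is why the atomic version above (or the per-process counting below) is usually preferable.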
    

    But for that matter, if all you need to do is keep a count of the total number of times each output value is seen, you can avoid all of this. Just have each process keep its own private count array of how many times each output has been seen in that process, and then add them all together at the end; addition is associative, after all. You could have each worker process send its results back to the parent through a pipe, through its own unique shared memory segment, or in various other simple ways.
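
    As an illustration of that last idea, here is a minimal sketch in which each worker keeps a private count array and writes it down a single pipe for the parent to sum. The worker count, output range, and stand-in computation are again invented for the example:

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define N_OUTPUTS 16    /* n: number of distinct output values (example) */
    #define N_WORKERS 4     /* number of forked workers (example)            */

    int main(void)
    {
        int fds[2];
        if (pipe(fds) == -1) {
            perror("pipe");
            return 1;
        }

        for (int w = 0; w < N_WORKERS; w++) {
            if (fork() == 0) {
                close(fds[0]);                    /* child only writes       */
                long local[N_OUTPUTS] = {0};
                for (int i = w * 1000; i < (w + 1) * 1000; i++) {
                    int output = i % N_OUTPUTS;   /* stand-in computation    */
                    local[output]++;              /* private, no locking     */
                }
                /* One write per worker; the whole array is far smaller than
                   PIPE_BUF, so the records cannot interleave. */
                write(fds[1], local, sizeof local);
                _exit(0);
            }
        }

        close(fds[1]);                            /* parent only reads       */
        long total[N_OUTPUTS] = {0}, partial[N_OUTPUTS];
        while (read(fds[0], partial, sizeof partial) == (ssize_t)sizeof partial)
            for (int i = 0; i < N_OUTPUTS; i++)
                total[i] += partial[i];
        while (wait(NULL) > 0)                    /* reap the workers        */
            ;
        for (int i = 0; i < N_OUTPUTS; i++)
            printf("output %d seen %ld times\n", i, total[i]);
        return 0;
    }

    Since each worker only touches its own array, there is no sharing at all until the final merge, which is both simpler and faster than any locking scheme.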