c macos freebsd bsd launch-daemon

pthread_mutex_lock and an abandoned shared mutex


Pardon me if I'm asking the obvious. I come from years of programming under Windows.

Currently I'm working on a project that runs under macOS (which, I believe, uses FreeBSD under the hood).

So I need to synchronize access to a shared resource from multiple processes (one is a launch daemon and others are client processes.)

I chose to do so using a shared (named) mutex.

I couldn't find any direct implementation of a named mutex in the kernel. All I could find were implementations built on shared memory, which go as follows:

  1. Create/open a shared memory segment, either with shm_open or with shmget+shmat
  2. Use a regular pthread_mutex_t but initialize it with attributes: pthread_mutexattr_setpshared(&mtxAtt, PTHREAD_PROCESS_SHARED);
  3. Then create it as a regular mutex: pthread_mutex_init
  4. And use it to lock and unlock with pthread_mutex_lock and pthread_mutex_unlock.
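
The four steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the helper name shared_mutex_open and the idea of letting only the creating process size and initialize the segment are assumptions made for illustration. It also leaves a small window in which a second process could map the segment before the creator has finished initializing the mutex.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open (creating if needed) a process-shared mutex
 * stored in the POSIX shared memory object `name`. Returns NULL on failure. */
pthread_mutex_t *shared_mutex_open(const char *name)
{
    assert(name != NULL);

    /* Step 1: create the segment, or open it if it already exists. */
    int creator = 1;
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0 && errno == EEXIST) {
        creator = 0;
        fd = shm_open(name, O_RDWR, 0600);
    }
    if (fd < 0)
        return NULL;

    /* Only the creator sizes the segment. */
    if (creator && ftruncate(fd, sizeof(pthread_mutex_t)) != 0) {
        close(fd);
        shm_unlink(name);
        return NULL;
    }

    void *mem = mmap(NULL, sizeof(pthread_mutex_t),
                     PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (mem == MAP_FAILED) {
        if (creator)
            shm_unlink(name);
        return NULL;
    }

    pthread_mutex_t *mtx = mem;
    if (creator) {
        /* Steps 2 and 3: mark the mutex as process-shared, then init it. */
        pthread_mutexattr_t att;
        pthread_mutexattr_init(&att);
        pthread_mutexattr_setpshared(&att, PTHREAD_PROCESS_SHARED);
        int rc = pthread_mutex_init(mtx, &att);
        pthread_mutexattr_destroy(&att);
        if (rc != 0) {
            munmap(mem, sizeof(pthread_mutex_t));
            shm_unlink(name);
            return NULL;
        }
    }
    /* Step 4 is left to the caller: pthread_mutex_lock / pthread_mutex_unlock. */
    return mtx;
}
```

Each process would call shared_mutex_open with the same name (for example "/my_mutex", a made-up name) and then lock and unlock the returned mutex as usual.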

This works, except for one edge case:

Say a client acquires this shared mutex with a call to pthread_mutex_lock and then either crashes or is terminated by the user, so it never calls pthread_mutex_unlock. (In Microsoft parlance, a mutex left in that state is called "abandoned". On Windows, if another process then tries to acquire such a mutex, the locking function returns a special result to flag the situation.)

In my case though, for macOS, if my launch daemon tries to call pthread_mutex_lock on the same shared mutex (after it became "abandoned"), it will deadlock forever.

Thus I'm wondering: is there a way to determine whether a mutex is in such an "abandoned" state, so that I can avoid blocking the thread forever when trying to lock it?


EDIT: While trying to understand how to use a System V semaphore as an IPC mutex, as suggested in the comments, I found the following code snippet (using AI):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

To init:

key_t key = ftok("/tmp/semfile", 1);
if (key < 0) {
    perror("ftok");
    exit(EXIT_FAILURE);
}

int semid = semget(key, 1, IPC_CREAT | 0666);
if (semid < 0) {
    perror("semget");
    exit(EXIT_FAILURE);
}

// initialize the semaphore with a value of 1
if (semctl(semid, 0, SETVAL, 1) < 0) {
    perror("semctl");
    exit(EXIT_FAILURE);
}

Then to enter mutex:

struct sembuf sops = { 0, -1, SEM_UNDO };

// acquire the semaphore (lock)
if (semop(semid, &sops, 1) < 0) {
    perror("semop");
    exit(EXIT_FAILURE);
}

and to leave it:

sops.sem_op = 1;

// release the semaphore (unlock)
if (semop(semid, &sops, 1) < 0) {
    perror("semop");
    exit(EXIT_FAILURE);
}

And finally, cleanup:

// destroy the semaphore
if (semctl(semid, 0, IPC_RMID) < 0) {
    perror("semctl");
    exit(EXIT_FAILURE);
}

The problem, though, is that it works only within a single process. In the IPC case, each process can enter and leave as if the semaphores in the different processes were not connected to each other.

What is missing there?
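
One possible gap, assuming every process runs the whole "init" block above: semctl with SETVAL sets the counter back to 1 each time it runs, so a process that starts later silently undoes a lock already held by another process. (Separately, ftok fails unless the file it is given already exists, and IPC_RMID in one process destroys the semaphore set for everyone.) A minimal sketch of create-once initialization follows. The helper name open_ipc_mutex is hypothetical, and there is still a short window between semget and SETVAL in the creating process that this sketch does not close.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the caller has to define union semun; macOS declares it in
 * <sys/sem.h>. */
#if defined(__linux__)
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};
#endif

/* Hypothetical helper: return the id of a one-semaphore set for `key`,
 * setting its value to 1 only in the process that actually creates it. */
int open_ipc_mutex(key_t key)
{
    /* IPC_PRIVATE would create a fresh, unrelated set on every call. */
    assert(key != IPC_PRIVATE);

    /* IPC_EXCL makes semget fail with EEXIST if the set already exists,
     * which tells us whether we are the creator. */
    int semid = semget(key, 1, IPC_CREAT | IPC_EXCL | 0600);
    if (semid >= 0) {
        union semun arg;
        arg.val = 1; /* 1 = unlocked */
        if (semctl(semid, 0, SETVAL, arg) < 0) {
            semctl(semid, 0, IPC_RMID);
            return -1;
        }
        return semid;
    }
    if (errno != EEXIST)
        return -1;

    /* Someone else created it: attach without touching the value. */
    return semget(key, 1, 0);
}
```

With this, the enter and leave code shown above stays the same; only the initialization changes, and the final IPC_RMID would be done by a single owner (for example the launch daemon) rather than by every client.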


Solution

  • OK, so I spent 3 days trying to do what on Windows would've taken me less than an hour.

    My conclusion: macOS does not have a reliable IPC mutex.


    So if someone finds a solution that meets the criteria that I outlined above, I'll be glad to hear it.

    I investigated so far:

    • POSIX mutexes via pthread_mutex_t in shared memory (as I showed above) lack one critical feature: robustness, i.e. PTHREAD_MUTEX_ROBUST_NP set via pthread_mutexattr_setrobust_np. Without it, they act just like semaphores.

    • Named semaphores exist (via sem_open), but they lack basic protection against a process that dies without unlocking them (the abandoned mutex issue that I described in my OP).

    • System V semaphores evidently provide a safeguard against an unexpectedly terminated process via the SEM_UNDO flag (as I showed in my OP), but I failed to find a way to make it work across processes. And the documentation for this stuff is abysmal, to say the least.

    • In total desperation to find a solution, I even tried to use "abstract UNIX domain sockets", which are supposedly closed automatically if a process dies. Something as someone had shown here. But guess what: macOS does not seem to support them, and the bind function simply fails with errno 2 (ENOENT).

    • I then thought of implementing my own mutex in shared memory, by keeping the ID of the locking process or thread in it. But again, there are no synchronization primitives that work across processes. For instance, there's no such thing as a named event. One can use a named semaphore for it, but that brings me back to the limitation I named earlier.

    • Lastly, the absolutely idiotic design pattern where one needs to call unlink to remove a named object, in addition to calling close on it, removes any possibility of a robust implementation. For instance, if one process crashes, it leaves the named object in an undefined state for the next process that wants to use it: how would it know what happened? Is the named object still being used, or is it simply abandoned? The only way to remedy this "for sure" is to restart macOS. Which is totally bonkers!

    I am really baffled by my discovery, as there should be synchronization objects that work across processes. How do they write their OS? Or, everyone is just using files and then has to unlink them? No wonder sometimes I have to restart my Mac to fix some issue in an app.
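
Regarding the SEM_UNDO bullet above: the following is a small sketch of how the flag is meant to behave across processes, written against the documented System V semantics and not verified on macOS. The function name sem_value_after_holder_exit is hypothetical, and IPC_PRIVATE plus fork is used only to keep the demo self-contained; unrelated processes would share a key instead.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux the caller has to define union semun; macOS declares it in
 * <sys/sem.h>. */
#if defined(__linux__)
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};
#endif

/* Hypothetical demo: a child takes the semaphore with SEM_UNDO and exits
 * without releasing it. Returns the semaphore value the parent sees
 * afterwards, or -1 if the setup fails. */
int sem_value_after_holder_exit(void)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0)
        return -1;

    union semun arg;
    arg.val = 1; /* 1 = unlocked */
    if (semctl(semid, 0, SETVAL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    assert(semctl(semid, 0, GETVAL) == 1); /* sanity check: starts unlocked */

    pid_t pid = fork();
    if (pid < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    if (pid == 0) {
        struct sembuf lock = { .sem_num = 0, .sem_op = -1, .sem_flg = SEM_UNDO };
        /* Exit while holding the semaphore; status 1 means the lock failed. */
        _exit(semop(semid, &lock, 1) == 0 ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    /* The kernel should have reverted the child's -1 when it exited. */
    int value = semctl(semid, 0, GETVAL);
    semctl(semid, 0, IPC_RMID);
    return value;
}
```

If the returned value is 1, the semaphore was released on behalf of the dead holder and the next process can acquire it. Note that the new holder gets no indication that the previous one died, unlike an abandoned mutex on Windows.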