Is this an appropriate use case for a recursive mutex?

I've heard from various sources (1, 2) that one should avoid using recursive mutexes as it may be a sign of a hack or bad design. Sometimes, however, I presume they may necessary. In light of that, is the following an appropriate use case for a recursive mutex?

// main.c
// gcc -Wall -Wextra -Wpedantic main.c -pthread

#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif /* _GNU_SOURCE */

#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct synchronized_counter
{
    int count;
    pthread_mutex_t mutex;
    pthread_mutexattr_t mutexattr;
} synchronized_counter;

synchronized_counter* create_synchronized_counter()
{
    synchronized_counter* sc_ptr = malloc(sizeof(synchronized_counter));
    assert(sc_ptr != NULL);

    sc_ptr->count = 0;

    assert(pthread_mutexattr_init(&sc_ptr->mutexattr) == 0);
    assert(pthread_mutexattr_settype(&sc_ptr->mutexattr, 
        PTHREAD_MUTEX_RECURSIVE) == 0);

    assert(pthread_mutex_init(&sc_ptr->mutex, &sc_ptr->mutexattr) == 0);

    return sc_ptr;
}

void synchronized_increment(synchronized_counter* sc_ptr)
{
    assert(pthread_mutex_lock(&sc_ptr->mutex) == 0);

    sc_ptr->count++;

    assert(pthread_mutex_unlock(&sc_ptr->mutex) == 0);
}

int main()
{
    synchronized_counter* sc_ptr = create_synchronized_counter();

    // I need to increment this counter three times in succesion without having
    // another thread increment it in between. Therefore, I acquire a lock
    // before beginning.

    assert(pthread_mutex_lock(&sc_ptr->mutex) == 0);

    synchronized_increment(sc_ptr);
    synchronized_increment(sc_ptr);
    synchronized_increment(sc_ptr);

    assert(pthread_mutex_unlock(&sc_ptr->mutex) == 0);

    return 0;
}

EDIT:

I wanted to ask the question with a simple example, but perhaps it came across as too simple. This is what I had imagined would be a more realistic scenario: I have a stack data structure that will be accessed by multiple threads. In particular, sometimes a thread will be popping n elements from the stack, but it must do so all at once (without another thread pushing or popping from the stack in between). The crux of the design issue is if I should have the client manage locking the stack themselves with a non-recursive mutex, or have the stack provide synchronized, simple methods along with a recursive mutex which the client can use to make multiple atomic transactions that are also synchronized.

Solution

Both of your examples - the original synchronized_counter and the stack in your edit - are correct examples of using a recursive mutex, but they would be considered poor API design if you were building a data structure. I'll try to explain why.

Exposing internals - the caller is required to use the same lock that protects internal access to members of the data structure. This opens the possibility to misusing that lock for purposes other than accessing the data structure. That could lead to lock contention - or worse - deadlock.
Efficiency - it's often more efficient to implement a specialized bulk operations like increment_by(n) or pop_many(n).
- First, it allows the data structure to optimize the operations - perhaps the counter can just do count += n or the stack could remove n items from a linked list in one operation. [1]
- Second, you save time by not having to lock/unlock the mutex for every operation.[2]

Perhaps a better example for using a recursive mutex would be as follows:

I have a class with two methods Foo and Bar.
The class was designed to be single-threaded.
Sometimes Foo calls Bar.

I want to make the class thread-safe, so I add a mutex to the class and lock it inside Foo and Bar. Now I need to make sure that Bar can lock the mutex when called from Foo.

One way to solve this without a recursive mutex is to create a private unsynchronized_bar and have both Foo and Bar call it after locking the mutex.

This can get tricky if Foo is a virtual method that can be implemented by a sub-class and used to call Bar, or if Foo calls out to some other part of the program that can call back into Bar. However, if you have code inside critical sections (code protected by a mutex) calling into other arbitrary code, then the behaviour of the program will be hard to understand, and it is easy to cause deadlocks between different threads, even if you use a recursive mutex.

The best advice is to solve concurrency problems through good design rather than fancy synchronization primitives.

[1] There are some tricker patterns like "pop an item, look at it, decide if I pop another one", but those can be implemented by supplying a predicate to the bulk operation.

[2] Practically speaking locking a mutex you already own should be pretty cheap, but in your example it requires at least a call to an external library function, which cannot be inlined.