
When does the routine passed to pthread_create start?


Given the following code

#include <pthread.h>
#include <stdio.h>

void *pt_routine(void *arg)
{
    pthread_t *tid;
    tid = (pthread_t *) arg;
    /* do something with tid , say printf?*/
    /*
    printf("The thread ID is %lu\n", *tid);
    */
    return NULL;
}

int main(int argc, char **argv)
{
    int rc;
    pthread_t tid;
    rc = pthread_create(&tid, NULL, pt_routine, &tid);
    if (rc)
    {
        return 1;
    }
    printf("The new thread is %lu\n", tid);
    pthread_join(tid, NULL);
    return 0;
}

Can the routine ALWAYS get the right tid?

Of course I could call pthread_self() to fetch the thread's own ID, but I just wonder when the routine actually starts running.


Solution

  • Well, there are actually two questions here:

    • which thread will execute first, and
    • whether the thread ID will be stored before the new thread starts.

    This answer concerns Linux, as I don't have any other platforms available. The answer to the first question can be found in the pthread_create(3) manual page:

    Unless real-time scheduling policies are being employed, after a call to pthread_create(), it is indeterminate which thread—the caller or the new thread—will next execute.

    So in your case it is indeterminate which thread will actually run first. The other question is how pthread_create is implemented: could it somehow create a dormant thread, store its ID first, and only then start it?

    Well, Linux creates the new thread using the clone system call:

    clone(child_stack=0x7f7b35031ff0, 
          flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM
              |CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
          parent_tidptr=0x7f7b350329d0,
          tls=0x7f7b35032700,
          child_tidptr=0x7f7b350329d0) = 24009
    

    Now, it seems that a thread ID is stored through a pointer passed to the clone call. However, child_tidptr clearly doesn't refer to the address of tid: if I print &tid, the address is different. It points to some internal variable within the pthread library, so it seems tid itself would be written by the library in the parent thread, presumably after the clone system call returns.

    And indeed, the pthread_self manual page says the following:

    The thread ID returned by pthread_self() is not the same thing as the kernel thread ID returned by a call to gettid(2).

    This confirms that kernel thread IDs are distinct from pthread_t values.
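To see the two kinds of ID side by side, here is a minimal Linux-only sketch. The names id_pair, record_ids and collect_thread_ids are hypothetical, made up for this illustration. It assumes the raw syscall(SYS_gettid) interface is available, which avoids depending on a gettid() wrapper in the C library.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record: both IDs as seen from inside the new thread. */
struct id_pair {
    pthread_t posix_id;   /* opaque POSIX thread ID */
    pid_t     kernel_id;  /* kernel thread ID (Linux-specific) */
};

static void *record_ids(void *arg)
{
    struct id_pair *ids = arg;
    ids->posix_id  = pthread_self();
    ids->kernel_id = (pid_t) syscall(SYS_gettid);
    return NULL;
}

/* Starts one thread, lets it record its IDs, and waits for it.
 * Returns 0 on success, -1 on failure. */
int collect_thread_ids(struct id_pair *out)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, record_ids, out) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return 0;
}
```

After a successful call, kernel_id holds an ordinary positive integer that differs from getpid(), while posix_id is an opaque value that should only be compared with pthread_equal().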

    Thus, in addition to this not being guaranteed by the POSIX spec, I see no such guarantee on Linux in practice. As far as I can tell, tid has to be set in the parent thread after clone returns, since otherwise the parent wouldn't know the thread ID of the child. This also means that if the child is the first to execute after that return, tid might not be set yet when the child reads it.
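If the routine really needs to read the tid variable written by pthread_create, one way to avoid relying on timing is to make the new thread wait until the parent has returned from pthread_create. The sketch below does this with a mutex; the names start_info, routine and run_with_published_tid are hypothetical, and this is only one possible approach (calling pthread_self() inside the thread is simpler when that is enough).

```c
#include <pthread.h>

/* Hypothetical shared state between the parent and the new thread. */
struct start_info {
    pthread_mutex_t lock;     /* held by the parent across pthread_create */
    pthread_t       tid;      /* written by pthread_create */
    int             matches;  /* set by the child: 1 if tid equals its own ID */
};

static void *routine(void *arg)
{
    struct start_info *info = arg;

    /* Blocks until the parent unlocks, i.e. until pthread_create
     * has returned and info->tid has been stored. */
    pthread_mutex_lock(&info->lock);
    info->matches = pthread_equal(info->tid, pthread_self()) != 0;
    pthread_mutex_unlock(&info->lock);
    return NULL;
}

/* Returns 1 if the child saw its own ID in tid, 0 if not, -1 on error. */
int run_with_published_tid(void)
{
    struct start_info info;
    int rc;

    info.matches = 0;
    if (pthread_mutex_init(&info.lock, NULL) != 0)
        return -1;

    pthread_mutex_lock(&info.lock);
    rc = pthread_create(&info.tid, NULL, routine, &info);
    pthread_mutex_unlock(&info.lock);   /* tid is valid from here on */

    if (rc != 0) {
        pthread_mutex_destroy(&info.lock);
        return -1;
    }
    pthread_join(info.tid, NULL);
    pthread_mutex_destroy(&info.lock);
    return info.matches;
}
```

Because the child cannot get past pthread_mutex_lock until the parent has unlocked, the result no longer depends on which thread the scheduler runs first.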