
Wrong exit value from pthread_exit


The code below simply creates two threads and tries to read their return values.

I compiled and ran it on a 32-bit glibc-2.15 system and everything worked (output: r1: 1, r2: 2). However, when I did the same on a 64-bit glibc-2.17 system, the output was wrong (output: r1: 0, r2: 2). Why does the same code behave differently on different systems?

Note: if the types of r1 and r2 are changed to void* or int*, as in the commented-out lines below, the code works on both systems.

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h> // needed for malloc in the commented-out variant below

void* worker(void* arg) {
    int i = (int) arg;
    pthread_exit((void*)i);
}

int main(int argc, char** argv) {

    pthread_t tid[2];
    int err = 0;
    err = pthread_create(&tid[0], NULL, worker, (void*) 1);
    if(err != 0) printf("error: %s\n", strerror(err));
    err = pthread_create(&tid[1], NULL, worker, (void*) 2);
    if(err != 0) printf("error: %s\n", strerror(err));

    ///*
    int r1 = 0, r2 = 0; // <-- WRONG: r1: 0, r2: 2
    //void *r1, *r2; // <-- OK: r1: 1, r2: 2
    pthread_join(tid[0], (void**) &r1);
    pthread_join(tid[1], (void**) &r2);
    printf("r1: %d, r2: %d\n", (int) r1, (int) r2);
    //*/

    // comment out the snippet above and uncomment the snippet below: // <-- OK: r1: 1, r2: 2
    /*
    int *r1 = (int*) malloc(sizeof(int));
    int *r2 = (int*) malloc(sizeof(int));
    pthread_join(tid[0], (void**) r1);
    pthread_join(tid[1], (void**) r2);
    printf("r1: %d, r2: %d\n", (int)(*r1), (int)(*r2));
    */
    return 0;
}

Solution

  • Short answer: on a 64-bit system, sizeof(void*) != sizeof(int), so by passing the address of an int to pthread_join you invoke undefined behavior and corrupt the stack (running that variant of the program under Address Sanitizer should detect the error). On the 32-bit system both types have the same size, which is why the code appeared to work there.

    In the variant where the int is heap-allocated, you corrupt the heap instead, but you don't notice it yet (the program is likely to crash later, on a subsequent malloc or free). Running that variant of the program under Valgrind or Address Sanitizer should readily show the heap corruption.

    Longer answer: pthread_join(tid, &x) essentially performs this:

    memcpy(&x, &previously_used_pthread_exit_value, sizeof(void*));
    

    It should now be clear that passing the address of any variable for which sizeof(x) < sizeof(void*) invokes undefined behavior.