
Linux tgkill(), is it really possible to see two threads with the same tid?


On Ubuntu Linux 20.04.4 (Linux kernel v5.13), man tgkill says:

int tgkill(int tgid, int tid, int sig);

tgkill() sends the signal sig to the thread with the thread ID tid in the thread group tgid.

My question is: can the system really have two threads with the same tid (acquired by gettid()) at a single moment?

If it can't, then why does tgkill() force the user to provide the tgid parameter? The system should be able to look up the corresponding tgid for a given tid by itself.

BTW: I know that, to query tgid from tid (for example, 1554725) manually, we can cat /proc/1554725/status and grab the Tgid field.

The above statement can be verified using threadtid.cpp below:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <pthread.h>

void* threadFunc(void *arg)
{
    (void)arg;
    int child_tid = (int)gettid();
    printf("Child:  tid=%d\n", child_tid);
    sleep(30);
    return nullptr;
}

int main(int argc, char *argv[])
{
    pthread_t t1;
    int err = pthread_create(&t1, nullptr, threadFunc, nullptr);
    if (err != 0)
        exit(4);

    int parent_tid = (int)gettid();
    
    printf("Parent: tid=%d\n", parent_tid);

    void* childres = 0;
    err = pthread_join(t1, &childres);
    if (err != 0)
        exit(4);

    printf("Done.\n");
    exit(EXIT_SUCCESS);
}

Reading /proc/1554725/status (1554725 being the child's tid) while the program is still alive, we see that:

Name:   threadtid.out   
Umask:  0002            
State:  T (stopped)     
Tgid:   1554724  
Ngid:   0        
Pid:    1554725          
PPid:   37645      
...           
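
The manual /proc lookup can also be done from code. The following is only a minimal sketch, assuming a Linux system with /proc mounted; tgid_of is a hypothetical helper name, not a system or library API.

```cpp
#include <fstream>
#include <limits>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns the Tgid recorded in /proc/<tid>/status,
// or -1 if the file cannot be read (for example, no such thread exists).
pid_t tgid_of(pid_t tid)
{
    std::ifstream status("/proc/" + std::to_string(tid) + "/status");
    std::string key;
    while (status >> key) {
        if (key == "Tgid:") {
            long tgid = -1;
            if (status >> tgid)
                return static_cast<pid_t>(tgid);
            return -1;
        }
        // Not the field we want: skip the rest of this line.
        status.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return -1;
}
```

For the run shown above, tgid_of(1554725) would be expected to return 1554724. Note that the answer can be stale by the time it is used, since the thread may exit and its tid may be reused in between.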



Solution

  • can the system really have two threads with the same tid (acquired by gettid()) at a single moment?

    No. Thread IDs are unique in the system at any given time, but they do get recycled.

    Thread group IDs (and process IDs) are a subset of thread IDs (see Relation between Thread ID and Process ID), chosen situationally: the TID of the first thread of a new process is assigned as that thread's TGID and that process's PID. When new threads are created in any other context, they inherit the PID and TGID of the thread that created them. The distinction between PID and TGID suggests that Linus* may have anticipated allowing a process to have multiple thread groups, but to date, that has not been implemented.

    why does tgkill() force the user to provide the tgid parameter? The system should be able to look up the corresponding tgid for a given tid by itself.

    The tgid parameter is not for identifying the thread to signal, per se. A TID would be enough for the system to find the thread, if it exists. The TGID is there to minimize the likelihood of the program signaling the wrong thread because the intended thread has terminated and its TID has been recycled for a new thread.

    With the TGID, that can only happen if the process to which that TGID belongs creates a very large number of threads itself. Without it, it can happen because of the behavior of other processes on the system, with no special privileges required of those other processes. I can imagine that as an exploitable weakness. I'm uncertain whether an actual exploit was ever created, but I note that Linux has an obsolete syscall, tkill(), that does the same thing as tgkill() without requiring the TGID to be specified. Linux rarely, if ever, actually removes syscalls, but tkill() should not be used, and glibc does not provide a wrapper function for it.
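
The effect of the tgid check can be observed directly. The sketch below is illustrative only: signal_thread is a hypothetical helper name, and it goes through the raw SYS_tgkill syscall on the assumption that the glibc in use may predate the tgkill() wrapper (added in glibc 2.30).

```cpp
#include <cerrno>
#include <csignal>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: sends sig to thread tid, but only if that thread
// currently belongs to thread group tgid.
// Returns 0 on success, or the errno value reported by the kernel.
int signal_thread(pid_t tgid, pid_t tid, int sig)
{
    if (syscall(SYS_tgkill, tgid, tid, sig) == -1)
        return errno;
    return 0;
}
```

With sig set to 0 no signal is delivered and only the checks are performed, so signal_thread(getpid(), tid, 0) should return 0 for a live thread of the calling process, while passing a tgid that the thread does not belong to should return ESRCH even though the tid itself exists.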


    *This is all Linux-specific.