
Can I assign a per-thread index, using pthreads?


I'm optimizing some instrumentation for my project (Linux, ICC, pthreads), and would like some feedback on this technique for assigning a unique index to each thread, so I can use it to index into an array of per-thread data.

The old technique uses a std::map based on pthread id, but I'd like to avoid locks and a map lookup if possible (it is creating a significant amount of overhead).

Here is my new technique:

static PerThreadInfo info[MAX_THREADS]; // shared, each index is per thread

// Allow each thread a unique sequential index, used for indexing into per
// thread data.
1:static size_t GetThreadIndex()
2:{
3:   static size_t threadCount = 0;
4:   __thread static size_t myThreadIndex = threadCount++;
5:   return myThreadIndex;
6:}

later in the code:

// add some info per thread, so it can be aggregated globally
info[ GetThreadIndex() ] = MyNewInfo();

So:

1) It looks like line 4 could be a race condition if two threads were created at exactly the same time. If so, how can I avoid this (preferably without locks)? I can't see how an atomic increment would help here.

2) Is there a better way to create a per-thread index? Maybe by pre-generating the TLS index on thread creation?


Solution

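  • 1) An atomic increment does help here. The race is that two threads can read the same value of threadCount and assign themselves the same ID, so the whole sequence (read number, add 1, store number) has to happen as one atomic operation that also hands back the value it read, i.e. a fetch-and-add. On Intel that is "lock; xadd"; a plain "lock; inc" increments atomically but does not return the value, so it is not enough on its own. Otherwise use whatever your platform offers, such as the GCC-style __sync_fetch_and_add() builtin or InterlockedIncrement() on Windows.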

    2) You could make the whole info thread-local ("__thread static PerThreadInfo info;"), provided your only aim is to access the data per-thread easily and under a common name. Note that __thread only accepts types with a constant initializer, so PerThreadInfo would need to be a plain struct. If you actually want a globally accessible array, for example so another thread can aggregate the entries, then saving the index in TLS as you do is a very straightforward and efficient way to do it. You could also pre-compute the indexes and pass them along as arguments at thread creation, as Kromey noted in his post.
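
To make point 1 concrete, here is a minimal sketch of the question's GetThreadIndex() using an atomic fetch-and-add. It assumes a C++11 compiler (std::atomic and thread_local), which the question does not confirm; on an older toolchain the same idea could be written with __thread, a sentinel value and __sync_fetch_and_add(). The value of MAX_THREADS, the samples field and the Worker function are hypothetical, added only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <pthread.h>

// Hypothetical limit and payload; the question does not show the real ones.
static const std::size_t MAX_THREADS = 64;
struct PerThreadInfo { long samples; };

static PerThreadInfo info[MAX_THREADS]; // shared, one slot per thread

// Shared counter. fetch_add returns the old value and increments it in a
// single atomic step, so two threads can never observe the same value.
static std::atomic<std::size_t> threadCount(0);

// Returns a unique sequential index for the calling thread.
static std::size_t GetThreadIndex()
{
    // Initialized once per thread, the first time that thread gets here.
    // Later calls from the same thread only read the cached value.
    thread_local const std::size_t myThreadIndex =
        threadCount.fetch_add(1, std::memory_order_relaxed);
    return myThreadIndex;
}

// Hypothetical pthread entry point: records into its own slot and reports
// the index it was given through arg (a pointer to std::size_t).
static void* Worker(void* arg)
{
    const std::size_t idx = GetThreadIndex();
    assert(idx < MAX_THREADS); // the array is fixed-size, so guard the bound
    info[idx].samples += 1;    // no lock needed: no other thread uses this slot
    *static_cast<std::size_t*>(arg) = idx;
    return nullptr;
}
```

Only the first call in each thread touches the shared counter, so the steady-state cost is a thread-local read. Relaxed ordering is enough for handing out unique numbers; whatever thread aggregates the info array afterwards still needs its own synchronization with the writers, for example by joining them first.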