Tags: machine-learning, g++, pthreads, fork, loss-function

Are there alternatives to pthreads on Linux for parallel execution and memory sharing?


I wrote a C++ Linux application that used the pthread library, but it didn't work for me: instead of launching 100 threads it started only 98 (see pthread_join Segmentation fault with 100 threads). Is there an alternative other than 'fork' to parallelize my code? An advantage of threads was that all the global variables were shared and I could place a mutex wherever I had to write a shared variable.


Solution

  • Is there an alternative other than 'fork' to parallelize my code?

    There is std::thread, which C++ people around here tend to recommend over pthreads anyway. However, it is highly likely to be implemented on top of a lower-level thread library, most likely pthreads on a system that offers pthreads in the first place.
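
    For illustration, here is a minimal std::thread sketch (the worker count and the shared_sum variable are invented for the example). The threads share the process's memory just as pthreads do, and std::mutex with std::lock_guard takes the place of pthread_mutex_t:

        #include <iostream>
        #include <mutex>
        #include <thread>
        #include <vector>

        int main() {
            std::mutex m;               // protects shared_sum
            long shared_sum = 0;        // shared state, like a global in the pthread version
            std::vector<std::thread> workers;

            for (int i = 0; i < 8; ++i) {
                workers.emplace_back([&, i] {
                    std::lock_guard<std::mutex> lock(m);  // take the mutex before writing
                    shared_sum += i;
                });
            }
            for (auto& t : workers) t.join();             // wait for every worker

            std::cout << "sum = " << shared_sum << '\n';
        }

    Build with g++ -std=c++11 -pthread (or a later standard); on Linux, std::thread still needs the pthread runtime underneath.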

    There is also OpenMP, but this is again a wrapper around lower-level threading mechanisms.
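
    As a sketch of what that wrapper looks like in use (the reduction loop here is purely illustrative), OpenMP lets the compiler create and join the thread team for you; build with g++ -fopenmp:

        #include <cstdio>
        #include <omp.h>

        int main() {
            const int n = 1000000;
            double sum = 0.0;

            // OpenMP spawns a team of threads for this loop and joins them afterwards;
            // the reduction clause combines per-thread partial sums without an explicit mutex.
            #pragma omp parallel for reduction(+ : sum)
            for (int i = 0; i < n; ++i) {
                sum += 1.0 / (i + 1);
            }

            std::printf("sum = %f (threads available: %d)\n", sum, omp_get_max_threads());
            return 0;
        }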

    The only readily usable alternative to parallelizing via multiple threads is parallelizing via multiple processes, which is what I take you to mean by your reference to fork.
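
    A bare-bones sketch of that approach, assuming the work splits into independent slices: fork one worker process per slice and have the parent wait for them. Note that, unlike threads, each child gets a copy-on-write copy of memory rather than genuinely shared variables:

        #include <sys/wait.h>
        #include <unistd.h>
        #include <cstdio>

        int main() {
            const int nproc = 4;                 // illustrative worker count

            for (int i = 0; i < nproc; ++i) {
                pid_t pid = fork();
                if (pid == 0) {                  // child: do one slice of the work
                    std::printf("worker %d running as pid %d\n", i, getpid());
                    _exit(0);                    // child must not fall back into the loop
                }
            }
            while (wait(nullptr) > 0) {}         // parent: reap all children
            return 0;
        }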

    An advantage of threads was that all the global variables were shared and I could place a mutex wherever I had to write a shared variable.

    It is possible to share memory among multiple processes and to have mutexes that are shared among processes. That's a little trickier than just using a regular shared variable, but not so much so. The mechanisms for this are called "shared memory", and in the POSIX world there are two flavors: older, so-called System V shared memory segments, and newer POSIX shared memory.
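
    Here is a minimal POSIX shared-memory sketch (the segment name "/demo_shm" and the counter are invented for the example): the parent creates a mapping holding a process-shared mutex and a counter, then forks workers that update it under the lock. Link with -pthread (and -lrt on older glibc):

        #include <fcntl.h>
        #include <pthread.h>
        #include <sys/mman.h>
        #include <sys/wait.h>
        #include <unistd.h>
        #include <cstdio>

        // Layout of the shared region: a process-shared mutex plus the shared data.
        struct Shared {
            pthread_mutex_t mutex;
            long counter;
        };

        int main() {
            int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
            if (fd < 0) { perror("shm_open"); return 1; }
            if (ftruncate(fd, sizeof(Shared)) != 0) { perror("ftruncate"); return 1; }

            Shared* shared = static_cast<Shared*>(
                mmap(nullptr, sizeof(Shared), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
            if (shared == MAP_FAILED) { perror("mmap"); return 1; }

            // The mutex must be marked PTHREAD_PROCESS_SHARED to work across processes.
            pthread_mutexattr_t attr;
            pthread_mutexattr_init(&attr);
            pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
            pthread_mutex_init(&shared->mutex, &attr);
            shared->counter = 0;

            for (int i = 0; i < 4; ++i) {
                if (fork() == 0) {               // child: update the shared variable
                    pthread_mutex_lock(&shared->mutex);
                    shared->counter += 1;
                    pthread_mutex_unlock(&shared->mutex);
                    _exit(0);
                }
            }
            while (wait(nullptr) > 0) {}         // parent: wait for the children

            std::printf("counter = %ld\n", shared->counter);
            shm_unlink("/demo_shm");             // remove the name when done
            return 0;
        }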

    May I suggest, however, that a better solution might simply be to reduce the number of threads? 100 threads is hugely excessive for parallel computation on most machines, because your true concurrency is limited by the number of execution units (cores) the machine has. More threads than that may make some sense if you expect them to regularly block on I/O (on different files) for a significant time, but even then 100 is probably beyond the threshold of reasonable. If you have more threads contending for execution time than you have execution units on which to schedule them, then you are probably getting worse performance than you would with fewer threads.
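
    One way to apply that advice, as a sketch (the item count and slicing scheme are made up here): query the hardware with std::thread::hardware_concurrency() and spread the 100 units of work over that many threads instead of starting one thread per unit:

        #include <iostream>
        #include <thread>
        #include <vector>

        int main() {
            // Size the pool by the hardware, not by the number of work items.
            unsigned n = std::thread::hardware_concurrency();
            if (n == 0) n = 4;                   // the call may return 0 if unknown

            const int total_items = 100;         // e.g. the 100 units of work
            std::vector<std::thread> pool;

            for (unsigned t = 0; t < n; ++t) {
                pool.emplace_back([=] {
                    // Each thread strides across the items, handling every n-th one.
                    for (int i = static_cast<int>(t); i < total_items; i += static_cast<int>(n)) {
                        // ... process item i ...
                    }
                });
            }
            for (auto& th : pool) th.join();

            std::cout << "processed " << total_items << " items on " << n << " threads\n";
        }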