I have the following for loop, which contains a private function call inside of it:
for (i = 0; i < N; ++i)
    dates[i] = to_time_t(&string_dates[i][0]);
to_time_t simply converts a string (e.g. "18/03/2007") into a timestamp, and it does so with the help of mktime(), which is really slow. In fact, that loop alone takes more time than any other part of the program. To remedy this, I am trying to apply OpenMP to the loop, like this:
#pragma omp parallel for private(i)
for (i = 0; i < N; ++i)
    dates[i] = to_time_t(&string_dates[i][0]);
My OpenMP knowledge is limited, but I'm assuming that no element of the dates array is ever accessed by two threads simultaneously, since i is private. The same should apply to string_dates. But when I run this code, performance is actually worse, so I must be doing something wrong; I just don't see it. Any help is appreciated!
Edit: I should have included the to_time_t code from the start:
time_t to_time_t(const string *date) {
    std::tm tm = {};
    istringstream ss_tm(*date);
    ss_tm >> get_time(&tm, "%m/%d/%Y");
    return mktime(&tm);
}
The problem is in mktime(), which has a process-wide side effect. From the manual page:
Calling mktime() also sets the external variable tzname with information about the current timezone.
mktime() internally calls tzset(). The latter is serialised via a mutex lock, but what really slows things down in the multithreaded case is the constant cache thrashing. When a call to tzset() by a thread running on one CPU core writes to tzname, it invalidates that cache line on all the other cores, forcing the threads running there to reach into higher cache levels, or even main memory, the next time they call mktime().
You need to find or write an equivalent of mktime() that doesn't modify global state, or just stick to sequential execution for that part of the code. Note that it is perfectly fine to call mktime() simultaneously from multiple single-threaded processes (e.g., in a pure MPI application), since each process has its own copy of tzname.
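As a rough illustration of the first option, here is a minimal sketch of a re-entrant replacement. It assumes the dates can be interpreted as midnight UTC (mktime() uses local time, so the absolute values will differ unless your timezone is UTC; for sorting or differencing dates that usually doesn't matter) and that the input matches the "%m/%d/%Y" format from the question. It parses with sscanf and counts days since the epoch with the well-known civil-calendar algorithm, touching no global state, so it is safe to call from an OpenMP parallel region. The name to_time_t_utc is made up for this example.

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Days from 1970-01-01 to y-m-d in the proleptic Gregorian calendar
// (Howard Hinnant's "days_from_civil" algorithm; pure function, re-entrant).
static long days_from_civil(int y, unsigned m, unsigned d) {
    y -= m <= 2;                                  // treat Jan/Feb as months 13/14 of prior year
    const int era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);            // [0, 399]
    const unsigned doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;  // [0, 365]
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           // [0, 146096]
    return era * 146097L + static_cast<long>(doe) - 719468;
}

// Hypothetical thread-safe alternative to the to_time_t above: parses
// "mm/dd/yyyy" and returns midnight UTC of that date. Unlike mktime(),
// it writes no globals (no tzname, no tzset), so concurrent calls from
// an OpenMP loop neither serialise nor thrash each other's caches.
time_t to_time_t_utc(const std::string &date) {
    unsigned m = 0, d = 0;
    int y = 0;
    if (std::sscanf(date.c_str(), "%u/%u/%d", &m, &d, &y) != 3)
        return static_cast<time_t>(-1);           // parse failure
    return static_cast<time_t>(days_from_civil(y, m, d)) * 86400;
}
```

With that in place, the original #pragma omp parallel for loop can call to_time_t_utc instead of to_time_t and should scale with the number of cores, since each iteration is now a pure computation on private data.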