Tags: c, loops, optimization, conditional-statements

Optimisation of loops containing conditionals with expensive math functions


I have a loop like:

#pragma omp parallel for num_threads(threads) schedule(static)
for (int i = 0; i < ndata; i++) {
    result[i] = calculate(data[i]);
}

with (a simplified version of) the function calculate() being:

double calculate(double in) {
    double out;
    if (in < LP) {             /* lowest segment: linear */
        out = c.b1 * in;
    } else if (in < SP) {      /* second segment: power law */
        out = c.a2 + c.b2 * pow(c.c2 + c.d2 * in, c.e2);
    } else if (in < HP) {      /* third segment: power law */
        out = c.a3 + c.b3 * pow(c.c3 + c.d3 * in, c.e3);
    } else {                   /* highest segment: linear */
        out = c.a4 + c.b4 * in;
    }
    return out;
}

All calculation variables are doubles. It's an image-processing routine, so ndata can be 3 × the number of pixels, which is ~1E8 for modern cameras, and I'm trying to make the routine as responsive as possible. Depending on the sub-pixel value being processed, the calculation needed is either simple addition/multiplication or a more expensive call to pow(). I've already done a lot of precalculation outside the loop, and I'm using OpenMP to parallelise it, but is there anything more I can do to optimise this? I'm guessing it won't auto-vectorise particularly well, given that successive passes round the loop can mix pow() calls with simple calculations.


Solution

  • Consider using arrays in your struct instead of numbered member names (a1 ... a4, b1 ... b4, and so on). That will allow you to do something like:

    for (int i = 0; i < ndata; i++) {
        /* Branchless segment selection: each comparison yields 0 or 1,
           so j ends up as the index of the segment data[i] falls in. */
        size_t j = 0;
        j += (data[i] >= LP);
        j += (data[i] >= SP);
        j += (data[i] >= HP);
        result[i] = c.a[j] + c.b[j] * pow(c.c[j] + c.d[j] * data[i], c.e[j]);
    }

    

    Then just populate those arrays with 0.0 and 1.0 as appropriate (the variables are doubles, so plain 0.0/1.0 rather than 0.0f/1.0f), so that the two linear branches reduce to the same a + b * pow(c + d * in, e) form.
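
    As an illustration, here is one way that struct and its initialization could look. This is only a sketch: the Coeffs type and make_coeffs function are invented names, not from the original post; only the coefficient values come from the original calculate().

    typedef struct {
        double a[4], b[4], c[4], d[4], e[4];
    } Coeffs;

    static Coeffs make_coeffs(double b1,
                              double a2, double b2, double c2, double d2, double e2,
                              double a3, double b3, double c3, double d3, double e3,
                              double a4, double b4)
    {
        Coeffs k;
        /* Segment 0: b1*in  ==  0.0 + b1 * pow(0.0 + 1.0*in, 1.0) */
        k.a[0] = 0.0; k.b[0] = b1; k.c[0] = 0.0; k.d[0] = 1.0; k.e[0] = 1.0;
        /* Segments 1 and 2 keep their original power-law coefficients. */
        k.a[1] = a2;  k.b[1] = b2; k.c[1] = c2;  k.d[1] = d2;  k.e[1] = e2;
        k.a[2] = a3;  k.b[2] = b3; k.c[2] = c3;  k.d[2] = d3;  k.e[2] = e3;
        /* Segment 3: a4 + b4*in  ==  a4 + b4 * pow(0.0 + 1.0*in, 1.0) */
        k.a[3] = a4;  k.b[3] = b4; k.c[3] = 0.0; k.d[3] = 1.0; k.e[3] = 1.0;
        return k;
    }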

    From there it's just a matter of optimizing a specialized pow function for inlining and vectorization. As a bonus, this should run in constant time as long as your pow function does, though at the cost of possibly unnecessary pow() work for a good portion of the data; whether that trade-off pays off will depend on the data set.
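
    As a minimal sketch of such a specialized pow, assuming the base c.c[j] + c.d[j] * data[i] stays strictly positive over each segment (the exp/log identity below is undefined otherwise):

    #include <math.h>

    /* pow(x, e) == exp(e * log(x)) for x > 0. Written this way, a
       compiler with a vector math library (e.g. GCC with -Ofast and
       glibc's libmvec) may vectorize the exp/log calls across the loop. */
    static inline double fast_pow(double x, double e)
    {
        return exp(e * log(x));
    }

    One caveat: with the 0.0/1.0 encoding above, the linear segments evaluate pow(in, 1.0), so this identity breaks for zero or negative inputs; if the data can be non-positive, fall back to the library pow for those segments or handle them separately.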