Search code examples
c++sortinginterpolationstdstdvector

C++ Efficient interpolation of a std::vector


I need to find the value of a function given its unknown by interpolation. The problem is that the one I created is way too inefficient.

Firstly, I read a data file that contains both y=g(T) and T, but in discrete form. And I store their values in a std::vector<double>.

After this, I convert T (std::vector<double> Tgdat) to x (std::vector<double> xgdat). This will be the x-axis that will accompany the y-axis, (std::vector<double> gdat).

Then, I create a function to interpolate my vector std::vector<double> gdat, so that, given some x (which its value is in between two elements of the vector std::vector<double> xgdat), the program can spit some value for g(x). This function receives the vectors by reference, not because I want to modify them (that's why I also pass them as const), but so that the computer doesn't have to create copies of it.

double geffx (double x, const std::vector<double> &gdat, const std::vector<double> &xgdat)
{
  //Local variables
  double g;
  int k,l;

  //Find the index of the element of xgdat that is nearest to x
  auto i = min_element(xgdat.begin(), xgdat.end(),
      [x] (double a, double b)
      {
          return abs(x-a)<abs(x-b);
      });
  k = std::distance(xgdat.begin(), i); //Nearest index

  //Find the index of the element of xgdat that is nearest to x
  //and it is not the same index as before
  auto j = min_element(xgdat.begin(), xgdat.end(),
      [x,&xgdat,k] (double a, double b)
      {
          if (a!=xgdat[k]) return abs(x-a)<abs(x-b);
          else return false;
      });
  l = std::distance(xgdat.begin(), j); //Second nearest index

  //Interpolation:
  if(xgdat[k]<xgdat[l]) 
      g = gdat[k]+(x-xgdat[k])*(gdat[l]-gdat[k])/(xgdat[l]-xgdat[k]);
  else 
      g = gdat[l]+(x-xgdat[l])*(gdat[k]-gdat[l])/(xgdat[k]-xgdat[l]);

  return g;
}

This seems to be highly inefficient, but I can't wrap my head around something that solves this same problem but in a more efficient way. I've tried the const thing and also passing by reference, but I guess the big problem is the min_element() function/method, and maybe it also has to do with the if-else at the end to return the value of g.


Edit: Extra info

I'm using g++ as the compiler, the number of elements is 275.

Since this function is part of an EDO solver it's called in every step (1e4 steps) multiple times until it converges. I need 4 interpolators which are called more than once each for its evaluation, so I would say that the function needs to be accessed more than 1e6 times.

Back when I substituted g(x) with a constant (no need of interpolators) the execution time was around 1-10 seconds. Now it's 45 mins - 1 hour. (very bad, I know, that's why I need help)


Solution

  • Lose min_element, lose the absolute distance comparison. It's a convex transformation which can be searched efficiently, but not by any function that exists in the C++ standard library.

    You don't want the two closest points anyway, you want to bracket your evaluation above and below. (The only case where "closest two" always gives a bracketing pair is when the samples are uniformly spaced, and if that's true you don't need a search at all, you can calculate the index directly using the spacing interval).

    Use lower_bound to do an efficient binary search in the sorted array for the x you care about. That's one side of your bracket, the other index is one lower.

    In the end the top portion of your code will look something like:

    //Find the index of the element of xgdat that is nearest-above to x
    auto i = lower_bound(xgdat.begin(), xgdat.end(), x); 
    //If the vector values are in decreasing order use:
    //auto i = lower_bound(xgdat.rbegin(), xgdat.rend(), x);
    k = i - xgdat.begin(); //Nearest index
    if (i == xgdat.end())
      --k;  // extrapolating above
    else if (*i == x)
      return gdat[k];
    
    l = k? k - 1: 1; //nearest-below index, except when extrapolating downward
    
    // proceed with linear interpolation/extrapolation using l and k