I need to find the value of a function given its unknown by interpolation. The problem is that the one I created is way too inefficient.
Firstly, I read a data file that contains both y=g(T) and T, but in discrete form. And I store their values in a std::vector<double>
.
After this, I convert T (std::vector<double> Tgdat
) to x (std::vector<double> xgdat
). This will be the x-axis that will accompany the y-axis, (std::vector<double> gdat
).
Then, I create a function to interpolate my vector std::vector<double> gdat
, so that, given some x (which its value is in between two elements of the vector std::vector<double> xgdat
), the program can spit some value for g(x). This function receives the vectors by reference, not because I want to modify them (that's why I also pass them as const
), but so that the computer doesn't have to create copies of it.
double geffx (double x, const std::vector<double> &gdat, const std::vector<double> &xgdat)
{
//Local variables
double g;
int k,l;
//Find the index of the element of xgdat that is nearest to x
auto i = min_element(xgdat.begin(), xgdat.end(),
[x] (double a, double b)
{
return abs(x-a)<abs(x-b);
});
k = std::distance(xgdat.begin(), i); //Nearest index
//Find the index of the element of xgdat that is nearest to x
//and it is not the same index as before
auto j = min_element(xgdat.begin(), xgdat.end(),
[x,&xgdat,k] (double a, double b)
{
if (a!=xgdat[k]) return abs(x-a)<abs(x-b);
else return false;
});
l = std::distance(xgdat.begin(), j); //Second nearest index
//Interpolation:
if(xgdat[k]<xgdat[l])
g = gdat[k]+(x-xgdat[k])*(gdat[l]-gdat[k])/(xgdat[l]-xgdat[k]);
else
g = gdat[l]+(x-xgdat[l])*(gdat[k]-gdat[l])/(xgdat[k]-xgdat[l]);
return g;
}
This seems to be highly inefficient, but I can't wrap my head around something that solves this same problem but in a more efficient way. I've tried the const
thing and also passing by reference, but I guess the big problem is the min_element()
function/method, and maybe it also has to do with the if-else at the end to return the value of g.
Edit: Extra info
I'm using g++
as the compiler, the number of elements is 275.
Since this function is part of an EDO solver it's called in every step (1e4 steps) multiple times until it converges. I need 4 interpolators which are called more than once each for its evaluation, so I would say that the function needs to be accessed more than 1e6 times.
Back when I substituted g(x) with a constant (no need of interpolators) the execution time was around 1-10 seconds. Now it's 45 mins - 1 hour. (very bad, I know, that's why I need help)
Lose min_element
, lose the absolute distance comparison. It's a convex transformation which can be searched efficiently, but not by any function that exists in the C++ standard library.
You don't want the two closest points anyway, you want to bracket your evaluation above and below. (The only case where "closest two" always gives a bracketing pair is when the samples are uniformly spaced, and if that's true you don't need a search at all, you can calculate the index directly using the spacing interval).
Use lower_bound
to do an efficient binary search in the sorted array for the x
you care about. That's one side of your bracket, the other index is one lower.
In the end the top portion of your code will look something like:
//Find the index of the element of xgdat that is nearest-above to x
auto i = lower_bound(xgdat.begin(), xgdat.end(), x);
//If the vector values are in decreasing order use:
//auto i = lower_bound(xgdat.rbegin(), xgdat.rend(), x);
k = i - xgdat.begin(); //Nearest index
if (i == xgdat.end())
--k; // extrapolating above
else if (*i == x)
return gdat[k];
l = k? k - 1: 1; //nearest-below index, except when extrapolating downward
// proceed with linear interpolation/extrapolation using l and k