My optimization task deals with calculation of the following integral and finding the best values of xl
and xu
:
Iterations take too long, so I decided to speed them up by calculating integral for all possible values xl
and xu
and then interpolate calculated values during optimization.
I wrote the following function:
def k_integrand(x, xl, xu):
return((x**2)*mpmath.exp(x))/((xu - xl)*(mpmath.exp(x)-1)**2)
@np.vectorize
def K(xl, xu):
y, err = integrate.quad(k_integrand, xl, xu, args = (xl, xu))
return y
and two identical arrays grid_xl
and grid_xu
with dinamic increment of values.
When I run the code I get this:
K(grid_xl, grid_xu)
Traceback (most recent call last):
File "<ipython-input-2-5b9df02f12b7>", line 1, in <module>
K(grid_xl, grid_xu)
File "C:/Users/909404/OneDrive/Работа/ZnS-FeS/Теплоемкость/Python/CD357/4 - Optimization CD357 interpolation.py", line 75, in K
y, err = integrate.quad(k_integrand, xl, xu, args = (xl, xu))
File "C:\Users\909404\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py", line 323, in quad
points)
File "C:\Users\909404\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py", line 372, in _quad
if (b != Inf and a != -Inf):
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I guess it comes from the fact that xl
should be always less than xu
.
Is there any way to compare the values of xl
and xu
and return NaN in case if xl>=xu
?
In the end I want to have something like this:
And to have the ability to use interpolation.
Maybe I have chosen the wrong way? I'd appreciate any help.
I can't reproduce your error unless I omit the np.vectorize
decorator. Setting xl
/xu
values that coincide does give me a ZeroDivisionError
though.
Anyway, there's nothing stopping you from checking the values of xu
vs xl
in your higher-level function. That way you can skip integration entirely for nonsensical data points and return np.nan
early:
import numpy as np
import mpmath
import scipy.integrate as integrate
def k_integrand(x, xl, xu):
return ((x**2)*mpmath.exp(x))/((xu - xl)*(mpmath.exp(x)-1)**2)
@np.vectorize
def K(xl, xu):
if xu <= xl:
# don't even try to integrate
return np.nan
y, err = integrate.quad(k_integrand, xl, xu, args = (xl, xu))
return y
grid_xl = np.linspace(0.1,1,10) # shape (10,) ~ (1,10)
grid_xu = np.linspace(0.5,4,8)[:,None] # shape (8,1)
With these definitions I get (following np.set_printoptions(linewidth=200)
for easier comparison:
In [35]: K(grid_xl, grid_xu)
Out[35]:
array([[0.99145351, 0.98925197, 0.98650808, 0.98322919, nan, nan, nan, nan, nan, nan],
[0.97006703, 0.96656815, 0.96254363, 0.95800307, 0.95295785, 0.94742104, 0.94140733, 0.93493293, 0.9280154 , nan],
[0.93730403, 0.93263063, 0.92745487, 0.92178832, 0.91564423, 0.90903747, 0.90198439, 0.89450271, 0.88661141, 0.87833062],
[0.89565597, 0.88996696, 0.88380385, 0.87717991, 0.87010995, 0.8626103 , 0.85469862, 0.84639383, 0.83771595, 0.82868601],
[0.84794429, 0.8414176 , 0.83444842, 0.82705134, 0.81924245, 0.81103915, 0.8024601 , 0.79352503, 0.7842547 , 0.77467065],
[0.79692339, 0.78974 , 0.78214742, 0.77416128, 0.76579857, 0.75707746, 0.74801726, 0.73863822, 0.72896144, 0.71900874],
[0.7449893 , 0.73732055, 0.7292762 , 0.72087263, 0.71212741, 0.70305921, 0.69368768, 0.68403329, 0.67411725, 0.66396132],
[0.69402415, 0.68602325, 0.67767956, 0.66900991, 0.66003222, 0.65076537, 0.6412291 , 0.63144388, 0.62143077, 0.61121128]])
You can see that the values perfectly agree with your linked image.
Now, I've got bad news and good news. The bad news is that while np.vectorize
provides syntactical sugar around calling your scalar integration function with array inputs, it won't actually give you speed-up compared to a native for loop. The good news is that you can replace the calls to mpmath.exp
with calls to np.exp
and you'll end up with the same result much faster:
def k_integrand_np(x, xl, xu):
return ((x**2)*np.exp(x))/((xu - xl)*(np.exp(x)-1)**2)
@np.vectorize
def K_np(xl, xu):
if xu <= xl:
# don't even try to integrate
return np.nan
y, err = integrate.quad(k_integrand_np, xl, xu, args = (xl, xu))
return y
With these definitions
In [14]: res_mpmath = K(grid_xl, grid_xu)
...: res_np = K_np(grid_xl, grid_xu)
...: inds = ~np.isnan(res_mpmath)
...:
In [15]: np.array_equal(res_mpmath[inds], res_np[inds])
Out[15]: True
In [16]: %timeit K(grid_xl, grid_xu)
107 ms ± 521 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [17]: %timeit K_np(grid_xl, grid_xu)
7.26 ms ± 157 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
So the two methods give the same result (exactly!), but the numpy version is almost 15 times faster.