I have a function that takes two axes as input - a time axis and an energy axis - and I'm trying to figure out if there's a way to generate the resulting 2D array through broadcasting, rather than looping over one of the axes.
The function (a solution to the diffusion equation) looks like this, and is also included in my code example below.
Here's the naïve approach I've tried, with the two axes having different lengths and the time array handled in a for loop:
import numpy as np
def function(zeta, tau, alpha):
    return 1/np.sqrt(2*np.pi*alpha**2*tau)*np.exp(-zeta**2/(2*alpha**2*tau))
zeta = np.linspace(-2, 2, 100)
tau = np.logspace(1e1, 1e9, 200)
alpha = 1e-6
res = np.zeros((len(tau), len(zeta)))
for i, t in enumerate(tau):
    res[i, :] = function(zeta, t, alpha)
But what I'd like to do is this:
res = function(zeta, tau, alpha)
Which gives the (I guess expected) error:
return 1/np.sqrt(2*np.pi*alpha**2*tau)*np.exp(-zeta**2/(2*alpha**2*tau))
ValueError: operands could not be broadcast together with shapes (100,) (200,)
So is there a way to simultaneously broadcast the function across zeta and tau, and speed up the building of the 100x200 array res?
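For context, NumPy broadcasting aligns shapes from the trailing dimension: two 1-D arrays of lengths 100 and 200 have no compatible axis, which is exactly the error above, while inserting a length-1 axis makes the shapes compatible. A minimal sketch:

```python
import numpy as np

a = np.ones(100)  # shape (100,)
b = np.ones(200)  # shape (200,)

# Trailing dimensions 100 and 200 differ and neither is 1, so this
# raises "operands could not be broadcast together".
try:
    a * b
except ValueError as e:
    print(e)

# Adding a length-1 axis makes the shapes (200, 1) and (100,),
# which broadcast to (200, 100).
print((b[:, None] * a).shape)  # (200, 100)
```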
First, a comment on np.logspace: I believe you may want to use tau = np.logspace(1, 9, 200) rather than tau = np.logspace(1e1, 1e9, 200). The docs say:
In linear space, the sequence starts at base ** start (base to the power of start) and ends with base ** stop
The default value of base is 10.0, so start and stop of 1 and 9 give a sequence from 10.0 ** 1 to 10.0 ** 9, which I believe may have been your intention.
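As a quick sanity check of that reading, the corrected call spans 10 to 1e9 as intended, while the original arguments would be treated as exponents themselves:

```python
import numpy as np

# With the default base of 10.0, start and stop are exponents:
tau = np.logspace(1, 9, 200)
print(tau[0], tau[-1])  # 10.0 1000000000.0

# np.logspace(1e1, 1e9, 200) would instead span 10**1e1 up to
# 10**1e9, and most of those values overflow to inf.
```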
Regarding the question about 2D broadcasting, I think this may be what you're looking for:
X, Y = np.meshgrid(zeta, tau)
result = function(X, Y, alpha)
The idea comes from this answer to a similar question.
UPDATE: This should give the same result and save some space:
X, Y = np.meshgrid(zeta, tau, sparse=True)
From the docs:
sparse coordinate grids are intended to be used with broadcasting. When all coordinates are used in an expression, broadcasting still leads to a fully-dimensional result array.
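A quick shape check (using the same zeta and tau as below) illustrates what sparse=True changes: instead of materializing two full (200, 100) arrays, each grid keeps a length-1 axis and broadcasting fills in the rest.

```python
import numpy as np

zeta = np.linspace(-2, 2, 100)
tau = np.logspace(1, 9, 200)

# Dense grids materialize two full (200, 100) arrays...
Xd, Yd = np.meshgrid(zeta, tau)
print(Xd.shape, Yd.shape)  # (200, 100) (200, 100)

# ...while sparse grids keep a length-1 axis each and rely on
# broadcasting to produce the same (200, 100) result.
Xs, Ys = np.meshgrid(zeta, tau, sparse=True)
print(Xs.shape, Ys.shape)  # (1, 100) (200, 1)
print((Xs + Ys).shape)     # (200, 100)
```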
import numpy as np
def function(zeta, tau, alpha):
    return 1/np.sqrt(2*np.pi*alpha**2*tau)*np.exp(-zeta**2/(2*alpha**2*tau))
zeta = np.linspace(-2, 2, 100)
tau = np.logspace(1, 9, 200)
alpha = 1e-6
X, Y = np.meshgrid(zeta, tau)
result = function(X, Y, alpha)
for name, arr in (('zeta', zeta), ('tau', tau), ('result', result)):
    print(name, arr.shape)
print(f"\nresult:\n{result}")
res = np.zeros((len(tau), len(zeta)))
for i, t in enumerate(tau):
    res[i, :] = function(zeta, t, alpha)
print(f"\nres.shape {res.shape}")
print(f"\nres:\n{res}")
print(f"\nallclose()? {np.allclose(result, res)}")
Output:
zeta (100,)
tau (200,)
result (200, 100)
result:
[[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
...
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]]
res.shape (200, 100)
res:
[[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
...
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]]
allclose()? True
Performance: Timeit for your original approach and the one above shows:
Timeit results:
foo_1 ran in 0.001397762499982491 seconds using 1000 iterations
foo_2 ran in 0.004021247100026813 seconds using 1000 iterations
So meshgrid looks to be about 3x faster for your inputs.
Full test code is here:
import numpy as np
def function(zeta, tau, alpha):
    return 1/np.sqrt(2*np.pi*alpha**2*tau)*np.exp(-zeta**2/(2*alpha**2*tau))
zeta = np.linspace(-2, 2, 100)
tau = np.logspace(1, 9, 200)
alpha = 1e-6
from timeit import timeit
def foo_1(zeta, tau, alpha):
    X, Y = np.meshgrid(zeta, tau)
    result = function(X, Y, alpha)
    return result

def foo_2(zeta, tau, alpha):
    res = np.zeros((len(tau), len(zeta)))
    for i, t in enumerate(tau):
        res[i, :] = function(zeta, t, alpha)
    return res
n = 1000
print('Timeit results:')
for foo in ['foo_' + str(i + 1) for i in range(2)]:
    t = timeit(f"{foo}(zeta, tau, alpha)", setup=f"from __main__ import zeta, tau, alpha, {foo}", number=n) / n
    print(f'{foo} ran in {t} seconds using {n} iterations')
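As an aside, the same (200, 100) result can also be obtained without building any grid arrays at all, by giving tau a length-1 trailing axis so that plain broadcasting does the work (a sketch using the same function and inputs as above):

```python
import numpy as np

def function(zeta, tau, alpha):
    return 1/np.sqrt(2*np.pi*alpha**2*tau)*np.exp(-zeta**2/(2*alpha**2*tau))

zeta = np.linspace(-2, 2, 100)
tau = np.logspace(1, 9, 200)
alpha = 1e-6

# tau[:, None] has shape (200, 1); broadcasting against the (100,)
# zeta array yields a (200, 100) result, matching the meshgrid version.
result = function(zeta, tau[:, None], alpha)
print(result.shape)  # (200, 100)
```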