I would like to take make a matrix that has all its entries as a function of some variable x
. Hence B(x)
would give N x N
output ideally in a fast manner. This is a simple task in fact if you are willing to type out the matrix with functions as entries. As an example:
f1 = lambda x: 2*x
f2 = lambda x: x**2
B = lambda x : np.array([[f1(x),f2(x)],
[f2(x),f1(x)]])
This is naive as it is not scale-able to a scenario where your array is large and has a variety of functions. Typing it out would take a long time for a single problem. One would usually create an empty array and use two Python for loops to compute the specific function for a given entry and then deposit the output in the array. The array is then returned.
The problem with the above method is that each time the function is called, it runs those for loops. This makes it slow if you want to run the function on a dataset of x
values. I've tried to create a static callable array using Sympy's lambdfiy
function. For an evaluation of x
it seems faster than the for loop solution in pure Python. However, this is horribly outweighed by the setup cost. Please see my code below for the details.
Is there a way to use the vectorize
function in Numpy to speed things up? Can you perhaps find a solution that is faster than the for loop version?
I also do play around with the idea (or call it a dream) where one can evaluate the entire dataset of X instead of each x separately. Like broadcasting in Numpy. E.g.
# Naive
result1 = [np.sin(x) for x in X]
# vs
results2 = np.sin(X)
Anyways, that's quite far-fetched.
Here is the code I have written. Please do play around with the size of N
to see how fascinating the decline in speed is. Just to clarify, I have evaluated my program as a whole and this issue of the callable array is where the bottleneck lies.
import numpy as np
from sympy import symbols,lambdify,zeros
from time import time
def get_sympy(N):
'''
Creates a callable array using Sympys lambdfiy capabilites.
This is truly a callable array.
'''
x = symbols('x')
output = zeros(N,N)
for i in range(N):
for j in range(N):
if i == j:
output[i,j] = x**2
elif i == 1:
output[i,j] = x**3
elif j == 0:
output[i,j] = x**4
else:
output[i,j] = x
return lambdify(x,output,'numpy')
def get_python(x,N):
'''
Uses Python loops to setup an array that mimic that of a callable array.
It is not truly a callable array as the loops run on each call of
this function.
'''
output = np.zeros((N,N))
f1 = lambda x: x**2
f2 = lambda x: x**3
f3 = lambda x: x**4
for i in range(N):
for j in range(N):
if i == j:
output[i,j] = f1(x)
elif i == 1:
output[i,j] = f2(x)
elif j == 0:
output[i,j] = f3(x)
else:
output[i,j] = x
return output
if __name__ == '__main__':
N = 30
X = np.random.uniform()
callable_sympy_array = get_sympy(N)
callable_python_array = lambda x: get_python(x,N)
t1 = time()
sympy_result = callable_sympy_array(X)
t2 = time()
python_result = callable_python_array(X)
t3 = time()
sympy_func = get_sympy(N)
t4 = time()
sympy_time = t2-t1
python_time = t3-t2
sympy_setup_time = t4-t3
print('\nSingle Evaluation Results:\n')
print('Sympy: ',round(sympy_time,5))
print('Python: ',round(python_time,5))
print('Sympy + Setup',round(sympy_setup_time,5))
evals = 100
print('\nResults for {0} evaluations of a {1} by {1} array:\n'.format(evals,N))
print('Sympy: ',sympy_setup_time + evals*sympy_time)
print('Python: ',python_time*evals)
Fast numpy
evaluation requires applying the built-in compiled operators/functions to whole arrays. Any sort of python level iteration slows you down, as does evaluating (general) Python functions on scalars. The fast stuff is mostly limited to the operators (like **
) and ufunc
(np.sin
, etc).
Your sympy
generated function illustrates this:
In an isympy
session:
In [65]: M = get_sympy(3)
using the ipython
code introspection:
In [66]: M??
Signature: M(x)
Docstring:
Created with lambdify. Signature:
func(x)
Expression:
Matrix([[x**2, x, x], [x**3, x**2, x**3], [x**4, x, x**2]])
Source code:
def _lambdifygenerated(x):
return (array([[x**2, x, x], [x**3, x**2, x**3], [x**4, x, x**2]]))
Imported modules:
Source:
def _lambdifygenerated(x):
return (array([[x**2, x, x], [x**3, x**2, x**3], [x**4, x, x**2]]))
File: /<lambdifygenerated-8>
Type: function
So this a function in x
, using numpy
operations, the **
operator, and array creation. Just as though you had typed it in. sympy
creates this by lexical substitutions in its symbolic code, so you can say it does 'type it in'.
It can operate on a scalar
In [67]: M(3)
Out[67]:
array([[ 9, 3, 3],
[27, 9, 27],
[81, 3, 9]])
on an an array, here producing a (3,3,3) result:
In [68]: M(np.arange(1,4))
Out[68]:
array([[[ 1, 4, 9],
[ 1, 2, 3],
[ 1, 2, 3]],
[[ 1, 8, 27],
[ 1, 4, 9],
[ 1, 8, 27]],
[[ 1, 16, 81],
[ 1, 2, 3],
[ 1, 4, 9]]])
I expect that it is easy to write a sympy
expression that when translated, can't take array argument(s). if
tests are notoriously hard to write in an array compatible form, since a Python if
expression only works on scalar booleans.
Your get_python
won't take an array x
, primarily because the
output = np.zeros((N,N))
has a fixed size; using np.zeros((N,N)+x.shape), x.dtype)
might get around that.
In any case, it will be slow because of the python level iteration at each call.
===
It would be faster if you tried to assign groups of elements. For example in this case:
In [76]: output = np.zeros((3,3),int)
In [77]: output[:] = 3
In [78]: output[:,0]=3**4
In [79]: output[1,:]=3**3
In [80]: output[np.arange(3),np.arange(3)]=3**2
In [81]: output
Out[81]:
array([[ 9, 3, 3],
[27, 9, 27],
[81, 3, 9]])
===
frompyfunc
is a handy tool for cases like this. In some cases it offers a 2x speed improvement over direct iteration. But even without that it can make the code more concise.
For example, a quick write up of your example:
In [82]: def foo(x,i,j):
...: if i==j: return x**2
...: if i==1: return x**3
...: if j==0: return x**4
...: return x
In [83]: f = np.frompyfunc(foo, 3, 1)
In [84]: f(3,np.arange(3)[:,None], np.arange(3))
Out[84]:
array([[9, 3, 3],
[27, 9, 27],
[81, 3, 9]], dtype=object)
and for the Out[68]
example:
In [98]: f(np.arange(1,4),np.arange(3)[:,None,None], np.arange(3)[:,None]).shape
Out[98]: (3, 3, 3)
In [99]: timeit f(np.arange(1,4),np.arange(3)[:,None,None], np.arange(3)[:,None]).shape
23 µs ± 471 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [100]: timeit M(np.arange(1,4))
21.7 µs ± 440 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Evaluated on a scalar x
, my f
is about the same speed as your get_python
.
In [115]: MM = get_sympy(30)
In [116]: timeit MM(3)
109 µs ± 112 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [117]: timeit get_python(3,30)
241 µs ± 2.06 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [118]: timeit f(3,np.arange(30)[:,None], np.arange(30)).astype(int)
254 µs ± 1.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)