Speed up a sympy-lambdified and vectorized function

I am using sympy to generate some functions for numerical calculations. To do so, I lambdify an expression and vectorize it so that I can use it with numpy arrays. Here is an example:

import numpy as np
import sympy as sp

def numpy_function():
    x, y, z = np.mgrid[0:1:40*1j, 0:1:40*1j, 0:1:40*1j]
    T   = (1 - np.cos(2*np.pi*x))*(1 - np.cos(2*np.pi*y))*np.sin(np.pi*z)*0.1
    return T

def sympy_function():
    x, y, z = sp.Symbol("x"), sp.Symbol("y"), sp.Symbol("z")
    T   = (1 - sp.cos(2*sp.pi*x))*(1 - sp.cos(2*sp.pi*y))*sp.sin(sp.pi*z)*0.1
    lambda_function = np.vectorize(sp.lambdify((x, y, z), T, "numpy"))
    x, y, z = np.mgrid[0:1:40*1j, 0:1:40*1j, 0:1:40*1j]
    T = lambda_function(x,y,z)
    return T

The problem is that the sympy version is much slower than the pure numpy version:

In [3]: timeit test.numpy_function()  
100 loops, best of 3: 11.9 ms per loop

vs.

In [4]: timeit test.sympy_function()
1 loops, best of 3: 634 ms per loop

So is there any way to get closer to the speed of the numpy version? I suspect np.vectorize is the slow part, but some of my code does not work without it. Thank you for any suggestions.

EDIT: I found the reason why the vectorize call is necessary:

In [35]: y = np.arange(10)

In [36]: f = sp.lambdify(x,sp.sin(x),"numpy")

In [37]: f(y)
Out[37]: 
array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849])

This seems to work fine. However:

In [38]: y = np.arange(10)

In [39]: f = sp.lambdify(x,1,"numpy")

In [40]: f(y)
Out[40]: 1

So for a simple expression like 1, the function doesn't return an array. Is there a way to fix this? Isn't this a bug, or at least an inconsistent design?

Solution

lambdify returns a plain scalar for a constant expression because the generated function contains no numpy calls that could produce an array. This follows from the way lambdify works (see https://stackoverflow.com/a/25514007/161801).
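To illustrate the point, here is a minimal sketch using hand-written stand-ins for the two lambdified functions from the question. The names f_sin and f_const are hypothetical, and the bodies are only an assumption about what the generated functions roughly look like, not lambdify's actual output:

```python
import numpy as np

# Hypothetical stand-in for sp.lambdify(x, sp.sin(x), "numpy"):
# the body calls a numpy function, so an array in gives an array out.
def f_sin(x):
    return np.sin(x)

# Hypothetical stand-in for sp.lambdify(x, 1, "numpy"):
# the body never touches x, so the input shape cannot affect the result.
def f_const(x):
    return 1

y = np.arange(10)
sin_out = f_sin(y)      # numpy array with the same shape as y
const_out = f_const(y)  # plain Python int, regardless of the input
```

The array result in the first case comes from np.sin, not from anything lambdify adds on top.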

But this is typically not a problem, because a constant automatically broadcasts to the correct shape in any operation that combines it with an array. On the other hand, if you explicitly worked with an array filled with the same constant, it would be much less efficient, because you would repeat the same computation for every element.
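As a sketch of what this means for the code in the question (assuming the rest of your code only needs numpy-compatible results), you can drop np.vectorize and call the lambdified function on the arrays directly. If you really do need an array of a given shape for a constant expression, np.broadcast_to is one option. The variable names below are chosen for illustration:

```python
import numpy as np
import sympy as sp

x, y, z = sp.symbols("x y z")
T = (1 - sp.cos(2*sp.pi*x))*(1 - sp.cos(2*sp.pi*y))*sp.sin(sp.pi*z)*0.1

# No np.vectorize: the generated function applies numpy functions
# to the whole arrays at once.
f = sp.lambdify((x, y, z), T, "numpy")

X, Y, Z = np.mgrid[0:1:40j, 0:1:40j, 0:1:40j]
result = f(X, Y, Z)

# Pure numpy version from the question, for comparison.
reference = (1 - np.cos(2*np.pi*X))*(1 - np.cos(2*np.pi*Y))*np.sin(np.pi*Z)*0.1

# A constant expression yields a scalar, which broadcasts when combined
# with an array.
g = sp.lambdify((x, y, z), sp.Integer(1), "numpy")
const = g(X, Y, Z)
combined = result + const

# If an explicit array is required, broadcast the scalar to the grid shape.
# Note that np.broadcast_to returns a read-only view.
const_array = np.broadcast_to(const, X.shape)
```

Without np.vectorize the call no longer loops over the elements in Python, so the timing should be much closer to the pure numpy version.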