Search code examples
pythonnumpyarray-broadcasting

Issue with Numpy brodcasting: implement "two-steps" broadcasting


I would like to compute a Fourier series like

Fourier series

(sorry had to copy a snapshot seems LaTex is not available on Stack Overflow), over a np.meshgrid.

Let us say I define some global variables

TERMS = 5
PERIOD = 1
A_0 = 1
c_nm = np.random.rand(TERMS,TERMS)
n_range = np.arange(0,TERMS, dtype = 'int')
m_range = np.arange(0,TERMS, dtype = 'int')
nn, mm = np.meshgrid(n_range, m_range)

and now, for fixed x and y , I could compute the value of the series in one line

x = 0
y = 0
A_0 + (c_nm*np.sin(nn*x/PERIOD)*np.sin(nn*y/PERIOD)).sum()

the arrays c_nm, nn and mm have all shape (TERMS, TERMS) and I think it works by "vectorising" over said matrices, before I take the sum.

But now, how to wrap this in a function that could vectorise a second time, over a x,y grid?

If I try

x = np.linspace(0,1,10)
y = np.linspace(0,1,10)
xx, yy = np.meshgrid(x,y)
def fourier_series(x,y):
    return A_0 + (c_nm*np.sin(nn*x/PERIOD)*np.sin(nn*y/PERIOD)).sum()

and then try

fourier_series(xx,yy)

attempting to compute over the grid at once, I get the error (understandably)

operands could not be broadcast together with shapes (5,5) (10,10) 

Now I think I see what is going on, but this is not of course what I am looking for. The computation over the arrays c_nm, nn and mm should be somehow done first, as an inner step, and only then the fourier_series(xx,yy) should operate on the x,y grid.

Of course, I would also like not to have global variables such as TERMS, nn, mm lying around and I would like TERMS to be an input to the function and nn, mm built internally, but I am stuck even before this.

Any hints please? Thanks


Solution

  • You want to compute 4 implicit nested rows. So you need 4 axes.

    Pseudo code of what you want

    for y in np.linspace(0,1,10):
        for x in np.linspace(0,1,10):
            A[y,x] = A_0
            for n in range(TERMS):
                for m in range(TERMS):
                    A[y,x] += c_nm*np.sin(nn*x/PERIOD)*np.sin(nn*y/PERIOD)
    

    So, for that just decide which axis to use for each variable, and use broadcasting.

    Note that there is no need of meshgrid for that (there is almost never a need for meshgrid; 95% of tutorials on internet that use meshgrid shouldn't. Broadcasting does the job. And with less memory and cpu usage most of the time).

    y=np.linspace(0,1,10)[:,None,None,None]
    x=np.linspace(0,1,10)[None,:,None,None]
    n=np.arange(TERMS)[None,None,:,None]
    m=np.arange(TERMS)[None,None,None,:]
    A=A_0 + (c_nm[None,None,:,:]*np.sin(m*x/PERIOD)*np.sin(n*y/PERIOD)).sum(axis=(2,3))
    

    Note that here, if we had used meshgrid (which would be possible, just meshgrid(y,x,n_range,m_range) from your variables would create 4d-grids), then all intermediary result would have been 4d, and all intermediary computation would have been done in 4d.) For example m*x would have been TERMS×TERMS×10×10 multiplication. When here, it is only TERMS×10 multiplications. Then TERMS×10 division by period instead of TERMS×TERMS×10×10. Then TERMS×10 sin. etc.

    Likewise, in your single x,y working version, your usage of meshgrid for nn and mm means that np.sin(nn*x/PERIOD) computes TERMS² multiplication, then TERMS² divisions, then TERMS² sin. When they are redundant, since nn contains TERMS times the same thing. With broadcasting, those operations are done only TERMS times (the strict minimum needed, for all TERMS different values). And int would be only when combining both sin and cos (for which there are TERMS² combinations of n and m values) that computations becomes 2D.

    So, bottom line: you almost never need meshgrid. Each time you want to use meshgrid, ask yourself if you couldn't achieve the same result with broadcast. And almost always, the answer is "yes, and it reduces the computation and memory".