Vectorizing the python cycle using NumPy

For the reason of my math modelling, I have a vector_U function

import numpy as np
def vector_U(U_0, t, func, dt):
    res = np.empty((len(t), 4))
    res[0] = U_0
    for i in range(1, len(t)):
        res[i] = res[i-1]+runge_kutta(res[i-1], t[i-1], func, dt)
    return res

where func is the function f([x, y, vx, vy]) = [vx, vy, ax, ay] of the differential equation and runge_kutta is

def runge_kutta(x, t, func, dt):
    k1 = dt*func(x, t)
    k2 = dt*func(x+k1/2, t+dt/2)
    k3 = dt*func(x+k2/2, t+dt/2)
    k4 = dt*func(x+k3, t+dt)
    return (k1+2*k2+2*k3+k4)/6

The purpose of the vector_U function is to take the 4-vector of starting conditions U_0 and t=np.linspace(0, t_final, t_final/dt) and return the array of the 4-vectors for each dt step. Then x = res[0] and y = res[1] columns are used for animation.

While it works fine, for a large t (e.g. len(t)==50000) the function obviously becomes slow just because it uses cycles. Since I am trying (instead of KISS, yes) make the program as much vectorized as possible to make it work faster, does the NumPy have any method to vectorize something like that? I don't mean the specific decision but the way to vectorize loops such as this because, I assume, I don't understand the vectorization method completely.

I thought about using np.cumsum of runge_kuttas... but for each runge_kutta I still need a previous value of U.

Solution

For code like this, Numba is a great way to speed it up. Here's a quick guide: https://numba.pydata.org/numba-doc/latest/user/5minguide.html

You can basically keep your code as-is, add a decorator (or wrapper function) and get a pretty good speedup, sometimes comparable to what you'd get with NumPy vectorized code.