Tags: python, arrays, distribution, post-processing

Python: slice array uniformly with respect to dataset


I have a data set with time t and data d. Unfortunately, I changed the export rate partway through (the rate was too high initially). I would like to subsample the data so that I thin out the high-frequency exported data but keep the low-frequency exported data near the end.

Consider the following code:

arr = np.loadtxt(file_name,skiprows=3)

where t = arr[:,0] and d = arr[:,1].

Here is a function that slices uniformly with respect to the index:

import math

def get_uniform_slices(arr, N_desired_points):
    n_rows = arr.shape[0]
    if n_rows > N_desired_points:
        # Stride chosen so that at most N_desired_points rows are kept
        n_skip = math.ceil(n_rows / N_desired_points)
    else:
        n_skip = 1
    return arr[0::n_skip, :]  # every n_skip-th row

However, the result looks fine in the high-frequency region but is too sparse in the low-frequency region, because a fixed index stride ignores how the points are spaced in t.
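
To make the problem concrete, here is a minimal sketch on made-up data. The two export intervals (0.01 and 1.0), the switch at t = 10, and the sine signal are assumptions chosen for illustration; they are not taken from my real file.

```python
import math
import numpy as np

# Hypothetical two-rate export: dt = 0.01 for t < 10, then dt = 1.0 up to t = 99
t_dense = 0.01 * np.arange(1000)
t_sparse = 10.0 + 1.0 * np.arange(90)
t = np.concatenate([t_dense, t_sparse])
d = np.sin(t)
arr = np.column_stack([t, d])

def get_uniform_slices(arr, N_desired_points):
    n_rows = arr.shape[0]
    n_skip = math.ceil(n_rows / N_desired_points) if n_rows > N_desired_points else 1
    return arr[0::n_skip, :]

sampled = get_uniform_slices(arr, 100)

# Count how many of the kept rows fall in each region
n_dense = np.count_nonzero(sampled[:, 0] < 10.0)
n_sparse = np.count_nonzero(sampled[:, 0] >= 10.0)
print(n_dense, n_sparse)
```

In this sketch most of the kept rows come from the first 10 time units, even though the remaining 90 time units make up most of the time axis.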

Is there some way to slice such that indexes are uniformly spaced with respect to t?

Any help is greatly appreciated.

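This is the function I used to find the indexes, based on the accepted answer. It relies on the nearest helper defined in the solution below and returns a boolean mask over t: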

def get_uniform_index(t,N_desired_points):
    t_uniform = np.linspace(np.amin(t),np.amax(t),N_desired_points)
    t_desired = [nearest(t_d, t) for t_d in t_uniform]
    i = np.in1d(t, t_desired)
    return i
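
As a rough check, the sketch below applies the same idea to made-up two-rate data. The data (dt = 0.01 for t < 10, then dt = 1.0) are an assumption for illustration. I also use np.isin here in place of np.in1d; that is my substitution, since np.in1d is deprecated in recent NumPy releases, and the two give the same mask for 1-D input.

```python
import numpy as np

def nearest(x, arr):
    # Value in arr closest to x (helper from the accepted answer)
    index = (np.abs(arr - x)).argmin()
    return arr[index]

def get_uniform_index(t, N_desired_points):
    # Target times, uniformly spaced over the full time range
    t_uniform = np.linspace(np.amin(t), np.amax(t), N_desired_points)
    # Snap each target to an actual sample time, then build a boolean mask
    t_desired = [nearest(t_d, t) for t_d in t_uniform]
    return np.isin(t, t_desired)

# Hypothetical two-rate export
t = np.concatenate([0.01 * np.arange(1000), 10.0 + 1.0 * np.arange(90)])
d = np.sin(t)

mask = get_uniform_index(t, 100)
t_pruned, d_pruned = t[mask], d[mask]

n_dense = np.count_nonzero(t_pruned < 10.0)
n_sparse = np.count_nonzero(t_pruned >= 10.0)
print(n_dense, n_sparse)
```

Because several targets can snap to the same sample, the mask may select fewer than N_desired_points rows; it never selects more.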

Solution

  • You have 2D data, e.g.,

    t = np.arange(0., 100., 0.5)
    d = np.random.rand(len(t))
    

    You want to keep only the data values at uniformly spaced times, e.g.,

    t_desired = np.arange(0., 100., 1.)
    

    Let's pick out the data points at the desired times using the in1d function, which returns a boolean mask marking the elements of t that appear in t_desired:

    d_pruned = d[np.in1d(t, t_desired)]
    

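    Of course, you must choose t_desired yourself, and its values must match values in t exactly, because in1d tests for equality. If that's a problem, you could pick approximately uniform times by snapping each target time to the nearest value in t, e.g.,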

    def nearest(x, arr):
        index = (np.abs(arr - x)).argmin()
        return arr[index]
    
    t_uniform = np.arange(0., 100., 1.)
    t_desired = [nearest(t_d, t) for t_d in t_uniform] 
    

    Here is the complete code:

    import numpy as np
    
    t = np.arange(0., 100., 0.5)
    d = np.random.rand(len(t))
    
    def nearest(x, arr):
        index = (np.abs(arr - x)).argmin()
        return arr[index]
    
    t_uniform = np.arange(0., 100., 1.)
    t_desired = [nearest(t_d, t) for t_d in t_uniform]
    
    d_pruned = d[np.in1d(t, t_desired)]
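
For long arrays, the list comprehension above scans all of t once per target time. A possible vectorized alternative is sketched below; it swaps in np.searchsorted for the argmin scan and therefore assumes t is sorted in ascending order and has at least two elements. The helper name nearest_indices is hypothetical, not part of NumPy.

```python
import numpy as np

def nearest_indices(t, targets):
    """For each target, index of the closest element of sorted t."""
    # Index of the first element of t that is >= each target
    right = np.searchsorted(t, targets)
    # Keep both neighbours inside the array bounds
    right = np.clip(right, 1, len(t) - 1)
    left = right - 1
    # Pick whichever neighbour is closer (left one on ties)
    use_left = (targets - t[left]) <= (t[right] - targets)
    return np.where(use_left, left, right)

t = np.arange(0., 100., 0.5)
d = np.random.rand(len(t))
t_uniform = np.arange(0., 100., 1.)

# np.unique drops repeated indices when several targets snap to the same sample
idx = np.unique(nearest_indices(t, t_uniform))
t_pruned = t[idx]
d_pruned = d[idx]
```

Working with integer indices rather than time values also avoids the exact floating-point comparison that the in1d step relies on.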