
Intersection points between two functions, multiple x-axis and y-axis coordinates


I'm working on a script that reads data from an MS Excel workbook and does some plotting. The data read from Excel are acceleration measurements in the a_x, a_y, and a_z directions, plus the time in seconds (as separate numpy arrays). The accelerations are passed through a 5 Hz low-pass filter before being plotted against the permissible acceleration levels; see Figure 1 for an example plot.

I need to find the amount of time the limiting curve is exceeded, but the plot shows a_z against abs(a_y), not time. My attempted solution is to:

  1. Find the intersection points between the acceleration and the limiting curve.
  2. Find the data points (a_z and abs(a_y)) closest to the intersection point.
  3. Get the indices for data points closest to the intersection points.
  4. Use the found indices in the time array and subtract the two values to get the amount of time above the limiting curve.

The problem is that I fail at bullet 3. I've succeeded in finding the intersection points between the limiting curve and my filtered data, and I've also managed to find the data points nearest to the intersection points. However, the index I get from the closest a_z value doesn't match the index I get from the closest abs(a_y) value.

Intersection points are found by:

f_l1 = LineString(np.column_stack((x, y1)))
s_l1 = LineString(np.column_stack((az3, abs(ay3))))
intersection1 = f_l1.intersection(s_l1)
az3_1,ay3_1 = LineString(intersection1).xy

So I'm using LineString from shapely.geometry to find the intersection points between the limiting curves (here shown for limiting curve y1) and the measured curve s_l1 of (a_z, abs(a_y)).
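For reference, a self-contained sketch of that step with made-up stand-ins for the limiting curve and the filtered trace (the formulas for y1 and ay3 below are purely illustrative). Note that the intersection of two LineStrings comes back as a Point, a MultiPoint, or an empty geometry, so it is safer to branch on the geometry type than to wrap the result in another LineString:

```python
import numpy as np
from shapely.geometry import LineString

# hypothetical stand-ins for the limiting curve (x, y1) and the
# filtered accelerations (az3, abs(ay3)) from the question
x = np.linspace(-2.0, 6.0, 50)
y1 = 2.4 - 0.1 * (x - 2.0) ** 2
az3 = np.linspace(-1.0, 5.0, 200)
ay3 = 1.5 + 1.2 * np.sin(az3)

f_l1 = LineString(np.column_stack((x, y1)))
s_l1 = LineString(np.column_stack((az3, np.abs(ay3))))
intersection1 = f_l1.intersection(s_l1)

# normalise the result to plain coordinate arrays
if intersection1.is_empty:
    az3_1, ay3_1 = np.array([]), np.array([])
elif intersection1.geom_type == "Point":
    az3_1, ay3_1 = np.array([intersection1.x]), np.array([intersection1.y])
else:  # MultiPoint
    az3_1 = np.array([p.x for p in intersection1.geoms])
    ay3_1 = np.array([p.y for p in intersection1.geoms])

print(az3_1, ay3_1)
```

With these toy curves the trace crosses the limit twice, once going up and once coming back down, so two intersection points come out.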

To find my data points closest to the intersection points, I used the approach from:

Intersection of two graphs in Python, find the x value

The function I use to get my data point closest to the intersection point is:

def find_nearest_index(array, value):
    array = np.asarray(array)
    idx = np.abs(array - value).argmin()
    return idx

where the array is my filtered acceleration array and the value is either the a_z or the abs(a_y) coordinate of the intersection point.
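For instance, with made-up numbers (the array is a hypothetical stand-in for a filtered acceleration trace):

```python
import numpy as np

def find_nearest_index(array, value):
    # index of the element of `array` closest in value to `value`
    array = np.asarray(array)
    return np.abs(array - value).argmin()

az = np.array([0.0, 0.5, 1.1, 1.9, 2.6])
idx = find_nearest_index(az, 2.0)   # 1.9 is the closest sample
print(idx)  # -> 3
```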

I suspect the reason for getting different indices depending on whether I use a_z or abs(a_y) is that my data points are "too far" from the actual intersection coordinate, i.e. I might get an a_z value close to the intersection point while the corresponding abs(a_y) is way off, and vice versa. For upcoming measurements I'll increase the sampling frequency, but I doubt this will solve the matter completely.

I've attempted some different approaches without much luck. My latest idea is to use both intersection coordinates when locating the nearest data point: check whether the index my find_nearest_index function returns for a_z equals the index it returns for abs(a_y). I don't know how, or if, that is possible, but maybe there's an obvious solution to my problem that I just don't see.
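One way to make the two lookups agree is to search for the nearest sample in the (a_z, abs(a_y)) plane instead of along each axis separately, i.e. minimise the Euclidean distance to the intersection point so that a single index comes out. A minimal sketch with hypothetical numbers (find_nearest_index_2d is an illustrative helper, not code from the question):

```python
import numpy as np

def find_nearest_index_2d(az, ay_abs, point):
    """Index of the sample whose (az, ay_abs) pair is closest,
    in Euclidean distance, to the given intersection point."""
    d2 = (np.asarray(az) - point[0]) ** 2 + (np.asarray(ay_abs) - point[1]) ** 2
    return d2.argmin()

# hypothetical filtered samples and one intersection point
az = np.array([0.0, 1.0, 2.0, 3.0])
ay = np.array([0.2, 1.8, 2.1, 0.5])
idx = find_nearest_index_2d(az, ay, (1.9, 2.0))
print(idx)  # -> 2: sample (2.0, 2.1) is nearest in the plane
```

Because the distance is measured in the plane, the a_z and abs(a_y) lookups can no longer disagree: both coordinates of the chosen sample belong to the same index.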

A correct solution for the accelerations would look like the following, where the indices of my data points match the intersection points: Desirable plot between intersection points. These indices are then used to calculate the amount of time above the limiting curve as Delta_t = t[index2] - t[index1].

But instead I typically get something like the following, where the index found using a_z differs from the index found using abs(a_y), resulting in a wrong plot and therefore also a wrong Delta_t: Typical plot between intersection points


Solution

  • This is my approach, re-using the idea of np.diff(). It allows for easy segmentation and, therefore, for getting the desired time stamps. Small modifications would allow for recursive use and hence a simple application in the case of the three limiting curves.

    import matplotlib.pyplot as plt
    import numpy as np
    
    ### this is roughly one of the limiting curves given in the OP
    pathx = np.array([
        -1.7,
        -1.5, -0.5,
        0, 
        1.75, 5.4,
        6
    ])
    
    pathy = np.array([
        0,
        0.75, 2.25,
        2.45,
        2.2, 0.75, 0
    ])
    
    path = np.column_stack( ( pathx, pathy ) )
    
    ### this creates a random path
    def rand_path( n ):
        vy = 0.04
        vx = 0
        xl = [0]
        yl = [0]
        for i in range( n ):
            phi = (1-1.6 *np.random.random() ) * 0.1
            mx = np.array(
                [
                    [ +np.cos( phi ), np.sin( phi ) ],
                    [ -np.sin( phi ), np.cos( phi ) ]
                ]
            )
            v = np.dot( mx, np.array( [ vx, vy ] ) )
            vx = v[0]
            vy = v[1]
            x = xl[-1] + vx
            y = yl[-1] + vy
            xl = xl + [ x ]
            yl = yl + [ y ]
        return xl, np.abs( yl )
    
    ### my version to check inside or not
    def check_inside( point, path ):
        """
        check if point is inside convex boundary
        it is based on the sign of a cross product
        """
        out = 1
        for p2, p1 in zip( path[ 1: ], path[ :-1 ] ):
            q = p2 - p1
            Q = point - p1
            cross = q[0] * Q[1] - q[1] * Q[0]
            if cross > 0:
                out = 0
                break
        return out
    
    ### test data
    xl ,yl = rand_path( 900 )
    data = np.column_stack( ( xl, yl ) )
    
    ### check and use np.diff like in the other example
    cp = np.fromiter(
        ( check_inside( p, path ) for p in data ),
        int
    )
    ip = np.argwhere( np.diff( cp ) )
    
    ### get the points
    if len( ip ):
        ip = np.concatenate( ip )
    ipout = list()
    for p in ip:
        if cp[ p ]:
            ipout.append( p + 1 )
        else:
            ipout.append( p )
            
    pnts = data[ ipout ]
    
    ### split the line segments
    innerSegments = list()
    outerSegments = list()
    ipchecklist = [ 0 ] + ipout + [ len( cp ) - 2 ]
    
    ### segments alternate: even ones are inside, odd ones outside
    for cntr, ( s, e ) in enumerate(
        zip( ipchecklist[:-1], ipchecklist[1:] )
    ):
        if cntr % 2:
            outerSegments.append( data[ s : e + 1 ] )
        else:
            innerSegments.append( data[ s + 1 : e ] )
    
    """
    Here would have only the points that are truly outside the border. 
    Instead of getting the line segment data we could access a corresponding
    time stamp in the same way and calculate the time outside the limit.
    Having the points all outside would result in a systematic error towards
    smaller times. We could also interpolate the crossing and the according
    time to estimate the time of the crossing. This would remove the bias 
    and improve the estimate.
    
    With more than one boundary, we could feed the outside segments in the
    same type of algorithm and repeat the procedure with a different border.
    We all is put in a function, this would be a sort of recursive procedure.
    """
    ### plot
    fig = plt.figure()
    
    ax = fig.add_subplot( 1, 1, 1)
    ax.scatter( pnts[:,0], pnts[:,1])
    
    ax.plot( path[:,0], path[:,1])
    # ~ ax.plot( xl, yl, ls='', marker='+')
    for cnt, s in enumerate( innerSegments ):
        col= "#{:02x}0000".format(
            int( 25 + 230 * cnt / len( innerSegments ) )
        )
        ax.plot( s[:,0], s[:,1], color=col )
    for cnt, s in enumerate( outerSegments ):
        col= "#0000{:02x}".format(
            int( 25 + 230 * (1 - cnt / len( outerSegments ) ) )
        )
        ax.plot( s[:,0], s[:,1], color=col )
    
    ax.grid()
    plt.show()
    
    
    

testing segmentation
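The interpolation mentioned in the closing comment of the code can be sketched as follows, reduced to a scalar signal against a constant limit for clarity (crossing_times is an illustrative helper, not part of the code above). The crossing time is interpolated linearly between the last sample on one side of the limit and the first sample on the other, which removes the bias towards smaller times:

```python
import numpy as np

def crossing_times(t, s, limit):
    """Linearly interpolated times at which the signal s(t)
    crosses the constant value `limit`."""
    t = np.asarray(t, dtype=float)
    s = np.asarray(s, dtype=float)
    above = s > limit
    # index of the sample just before each crossing
    idx = np.flatnonzero(np.diff(above.astype(int)))
    # fractional position of the crossing between the two samples
    frac = (limit - s[idx]) / (s[idx + 1] - s[idx])
    return t[idx] + frac * (t[idx + 1] - t[idx])

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
s = np.array([0.0, 2.0, 3.0, 1.0, 0.0])
tc = crossing_times(t, s, 1.5)     # one upward and one downward crossing
print(tc)                          # -> [0.75 2.75]
dt_above = tc[1] - tc[0]           # time spent above the limit: 2.0
```

The same idea carries over to the 2D border: interpolate between the last sample inside and the first sample outside (using the cp flags from the code above), and apply the resulting fraction to the time stamps.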