
How to join two X-Y plots held in two ndarrays optimally


I want to join two XY plots whose X axes can differ. The plots are held in NumPy ndarrays, and I want the join operation to be as fast as possible (I already know how to solve this with associative arrays).

The join operation is defined as PlotAB = join_op(PlotA, PlotB):

PlotA = 
array([[2, 5],
       [3, 5],
       [5, 5]])

where PlotA[:,0] is the X-axis and PlotA[:,1] is the Y-axis

PlotB = 
array([[1, 7],
       [2, 7],
       [3, 7],
       [4, 7]])

where PlotB[:,0] is the X-axis and PlotB[:,1] is the Y-axis

The joined array is:

PlotAB = 
array([[1, n, 7],
       [2, 5, 7],
       [3, 5, 7],
       [4, n, 7],
       [5, 5, n]])

where PlotAB[:,0] is the joined X-axis (sorted, unique values).

PlotAB[:,1] is the Y-axis of PlotA.

PlotAB[:,2] is the Y-axis of PlotB.

The 'n' marks places where a value is missing, i.e. that X value is not present in that plot.

By the way, I need this to compose data for a dygraphs plotter (http://dygraphs.com/gallery/#g/independent-series).

Please advise. Thanks.


Solution

  • Here's a solution that uses numpy.setdiff1d to find the x values that appear in only one of the input arrays, pads each array with [x, NaN] rows for the x values it lacks, and then uses numpy.argsort to sort both padded arrays by x so that their rows line up.

    import numpy as np
    
    def join_op(a,b):
        ax = a[:,0]
        bx = b[:,0]
    
        # elements unique to b
        ba_x = np.setdiff1d(bx,ax)
    
        # elements unique to a
        ab_x = np.setdiff1d(ax,bx)
    
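        # NaN placeholders for the y values each plot is missing
        # (np.nan rather than np.NaN, which NumPy 2.0 removed)
        ba_y = np.full(ba_x.shape, np.nan)
        ab_y = np.full(ab_x.shape, np.nan)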
    
        ba = np.array((ba_x,ba_y)).T
        ab = np.array((ab_x,ab_y)).T
    
        a_new = np.concatenate((a,ba))
        b_new = np.concatenate((b,ab))
    
        a_sort_idx = np.argsort(a_new[:,0])
        b_sort_idx = np.argsort(b_new[:,0])
    
        a_new_sorted = a_new[a_sort_idx]
        b_new_sorted = b_new[b_sort_idx]
    
        b_new_sorted_y = b_new_sorted[:,1].reshape(-1,1)
    
        return np.concatenate((a_new_sorted,b_new_sorted_y),axis=1)
    
    a = np.array([[2,5],[3,5],[5,5]])
    b = np.array([[1,7],[2,7],[3,7],[4,7]])
    c = join_op(a,b)
    print(c)
    

    Output:

    [[  1.  nan   7.]
     [  2.   5.   7.]
     [  3.   5.   7.]
     [  4.  nan   7.]
     [  5.   5.  nan]]
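
As a possible variation on the approach above, the same outer join can be sketched with numpy.union1d and numpy.searchsorted, which builds the joined X-axis once and scatters each plot's Y values into place, so no concatenation or per-array sort is needed. This is a minimal sketch rather than a benchmarked replacement: the name join_on_x is made up for illustration, and it assumes the X values within each plot are unique (if an X value repeats, the last Y value wins).

```python
import numpy as np

def join_on_x(a, b):
    """Outer-join two (N, 2) arrays of [x, y] rows on x; gaps become NaN."""
    # Sorted, de-duplicated union of both X-axes.
    x = np.union1d(a[:, 0], b[:, 0])

    # Start with every Y slot marked as missing.
    out = np.full((x.size, 3), np.nan)
    out[:, 0] = x

    # x is sorted and contains every input x value, so searchsorted
    # returns the exact output row for each input point.
    out[np.searchsorted(x, a[:, 0]), 1] = a[:, 1]
    out[np.searchsorted(x, b[:, 0]), 2] = b[:, 1]
    return out

plot_a = np.array([[2, 5], [3, 5], [5, 5]])
plot_b = np.array([[1, 7], [2, 7], [3, 7], [4, 7]])
plot_ab = join_on_x(plot_a, plot_b)
print(plot_ab)
```

For the example arrays this yields the same five rows as the output shown above. The input rows do not have to be sorted by X, since each point is placed by lookup rather than by position.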