I want to join two XY plots whose X axes can differ. The plots are held in numpy ndarrays, and I want the join operation to be as fast as possible (I already know how to solve this with associative arrays).
The join operation is defined as PlotAB = join_op(PlotA, PlotB), for example:
PlotA =
array([[2, 5],
[3, 5],
[5, 5]])
where PlotA[:,0] is the X-axis and PlotA[:,1] is the Y-axis.
PlotB =
array([[1, 7],
[2, 7],
[3, 7],
[4, 7]])
where PlotB[:,0] is the X-axis and PlotB[:,1] is the Y-axis.
The joined array is:
PlotAB =
array([[1, n, 7],
[2, 5, 7],
[3, 5, 7],
[4, n, 7],
[5, 5, n]])
where:
PlotAB[:,0] is the joined X-axis (sorted, unique values).
PlotAB[:,1] is the Y-axis of PlotA.
PlotAB[:,2] is the Y-axis of PlotB.
The 'n' entries mark places where a value is missing, i.e. that X value is not present in that plot.
By the way, I need this to compose data for a dygraphs plotter (http://dygraphs.com/gallery/#g/independent-series).
Please advise. Thanks.
Here's a solution that uses numpy.setdiff1d to find the x values that appear in only one of the input arrays, and numpy.argsort to sort each array after [x, NaN] rows have been appended for the x values it lacks.
import numpy as np

def join_op(a, b):
    ax = a[:, 0]
    bx = b[:, 0]
    # x values present only in b
    ba_x = np.setdiff1d(bx, ax)
    # x values present only in a
    ab_x = np.setdiff1d(ax, bx)
    # NaN placeholders for the missing y values
    ba_y = np.full(ba_x.shape, np.nan)
    ab_y = np.full(ab_x.shape, np.nan)
    ba = np.array((ba_x, ba_y)).T
    ab = np.array((ab_x, ab_y)).T
    # pad each array with the rows it lacks, so both cover the same x values
    a_new = np.concatenate((a, ba))
    b_new = np.concatenate((b, ab))
    # sort both arrays by x so their rows line up
    a_sort_idx = np.argsort(a_new[:, 0])
    b_sort_idx = np.argsort(b_new[:, 0])
    a_new_sorted = a_new[a_sort_idx]
    b_new_sorted = b_new[b_sort_idx]
    # append b's y column to the sorted a array
    b_new_sorted_y = b_new_sorted[:, 1].reshape(-1, 1)
    return np.concatenate((a_new_sorted, b_new_sorted_y), axis=1)

a = np.array([[2, 5], [3, 5], [5, 5]])
b = np.array([[1, 7], [2, 7], [3, 7], [4, 7]])
c = join_op(a, b)
print(c)
Output:
[[ 1. nan  7.]
 [ 2.  5.  7.]
 [ 3.  5.  7.]
 [ 4. nan  7.]
 [ 5.  5. nan]]
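For comparison, the same join can probably be done without padding and sorting each array separately. The sketch below is an alternative, not the method shown above: it builds the joined x-axis once with numpy.union1d and then uses numpy.searchsorted to place each plot's y values into a NaN-filled result. The function name join_op_searchsorted is made up for this illustration, and the code assumes that x values are unique within each plot (if an x value repeats, only one of its y values is kept).

```python
import numpy as np

def join_op_searchsorted(a, b):
    """Outer-join two (N, 2) plots on x; missing y values become NaN."""
    # Sorted, de-duplicated union of both x-axes.
    x = np.union1d(a[:, 0], b[:, 0])
    # One x column plus one y column per plot, pre-filled with NaN.
    out = np.full((x.size, 3), np.nan)
    out[:, 0] = x
    # Every x of a and b occurs in the sorted union, so searchsorted
    # returns the exact row index for each of them.
    out[np.searchsorted(x, a[:, 0]), 1] = a[:, 1]
    out[np.searchsorted(x, b[:, 0]), 2] = b[:, 1]
    return out

a = np.array([[2, 5], [3, 5], [5, 5]])
b = np.array([[1, 7], [2, 7], [3, 7], [4, 7]])
c = join_op_searchsorted(a, b)
print(c)
```

This version sorts the combined x values once instead of sorting two padded arrays, and the inputs do not need to be sorted beforehand. Whether it is actually faster on your data is worth measuring rather than assuming.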