I'm looking for a CuPy (GPU) counterpart of numpy.unique() that supports the axis option.
I have a CuPy 2D array from which I need to remove duplicate rows. Unfortunately, cupy.unique() flattens the array and returns a 1D array of unique values. I'm looking for something like numpy.unique(arr, axis=0), but CuPy does not support the axis option yet.
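import cupy as cp
import numpy as np

x = cp.array([[1,2,3,4], [4,5,6,7], [1,2,3,4], [10,11,12,13]])
y = cp.unique(x)                          # flattens, returns 1D unique values
y_r = np.unique(cp.asnumpy(x), axis=0)    # desired result, but computed on the CPU
print('The 2D array:\n', x)
print('Required:\n', y_r, 'But using CuPy')
print('The flattened unique array:\n', y)
# The next line raises an error because cupy.unique() does not accept axis here
print('Error producing line:', cp.unique(x, axis=0))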
I expect a 2D array with unique rows but I get a 1D array with unique numbers instead. Any ideas about how to implement this with CuPy or numba?
As of CuPy version 8.0.0b2, the function cupy.lexsort is correctly implemented. This function can be used as a workaround (albeit probably not the most efficient one) for cupy.unique with the axis argument.
Assuming the array is 2D, and that you want to find the unique elements along axis 0 (else transpose/swap as appropriate):
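import cupy

###################################
# replacement for numpy.unique with option axis=0
###################################
def cupy_unique_axis0(array):
    if len(array.shape) != 2:
        raise ValueError("Input array must be 2D.")
    # Sort rows lexicographically; lexsort treats the last key as primary,
    # so the columns are reversed to make column 0 the primary key.
    sortarr = array[cupy.lexsort(array.T[::-1])]
    # Keep the first row and every row that differs from its predecessor.
    mask = cupy.empty(array.shape[0], dtype=cupy.bool_)
    mask[0] = True
    mask[1:] = cupy.any(sortarr[1:] != sortarr[:-1], axis=1)
    return sortarr[mask]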
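To check the workaround against the array from the question, something like the following sketch can be used. The names unique_rows and to_host are made up for this example, and the NumPy fallback is only an assumption so that the snippet also runs on a machine without a GPU; the lexsort-and-compare logic is the same as in cupy_unique_axis0 above.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # touch the device so a missing GPU fails here
except Exception:
    xp = np  # assumption: CPU fallback so the sketch runs anywhere


def to_host(a):
    """Copy a CuPy array to NumPy; pass NumPy arrays through (hypothetical helper)."""
    return a.get() if hasattr(a, "get") else np.asarray(a)


def unique_rows(array):
    """Sorted unique rows of a 2D array, same idea as cupy_unique_axis0."""
    if array.ndim != 2:
        raise ValueError("Input array must be 2D.")
    sortarr = array[xp.lexsort(array.T[::-1])]
    mask = xp.empty(array.shape[0], dtype=xp.bool_)
    mask[0] = True
    mask[1:] = xp.any(sortarr[1:] != sortarr[:-1], axis=1)
    return sortarr[mask]


x = xp.array([[1, 2, 3, 4], [4, 5, 6, 7], [1, 2, 3, 4], [10, 11, 12, 13]])
result = to_host(unique_rows(x))
expected = np.unique(to_host(x), axis=0)  # CPU reference

print(result)
print("matches numpy.unique(axis=0):", np.array_equal(result, expected))
```

Note that the function indexes mask[0], so an array with zero rows would need a separate guard.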
Check the original cupy.unique source code (which this is based on) if you want to implement the return_* arguments (return_index, return_inverse, return_counts), too. I don't need those myself.
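For what it's worth, here is a rough sketch of how the inverse and counts outputs could be derived from the same sort and mask. This is not taken from the cupy.unique source; the function name unique_rows_extra is hypothetical, and the NumPy fallback is only an assumption so the snippet runs without a GPU.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # touch the device so a missing GPU fails here
except Exception:
    xp = np  # assumption: CPU fallback so the sketch runs anywhere


def unique_rows_extra(array):
    """Return (unique_rows, inverse, counts) for a non-empty 2D array."""
    if array.ndim != 2:
        raise ValueError("Input array must be 2D.")
    n = array.shape[0]
    order = xp.lexsort(array.T[::-1])
    sortarr = array[order]
    mask = xp.empty(n, dtype=xp.bool_)
    mask[0] = True
    mask[1:] = xp.any(sortarr[1:] != sortarr[:-1], axis=1)

    # Group id of every sorted row: increases by one at each new unique row.
    group = xp.cumsum(mask) - 1
    # Scatter the group ids back to the original row order.
    inverse = xp.empty(n, dtype=group.dtype)
    inverse[order] = group

    # Each group's size is the distance between consecutive group starts.
    starts = xp.flatnonzero(mask)
    ends = xp.concatenate([starts[1:], xp.array([n], dtype=starts.dtype)])
    counts = ends - starts
    return sortarr[mask], inverse, counts


x = xp.array([[1, 2, 3, 4], [4, 5, 6, 7], [1, 2, 3, 4], [10, 11, 12, 13]])
uniq, inverse, counts = unique_rows_extra(x)
print(uniq)
print(inverse)
print(counts)
```

Indexing the unique rows with inverse rebuilds the original array, which is a quick sanity check for the scatter step.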