Assume we have a two-dimensional array like the following:
import numpy as np

array1 = np.array([[1, 4, 3, 64356, 5435, 434],
                   [11, 46, 3, 7356, 585, 74],
                   [51, 406, 3, 769, 5435, 24],
                   [12, 45, 5, 656, 135, 134],
                   [112, 475, 5, 656, 1385, 134],
                   [13, 46, 5, 656, 1385, 19]])
Rows 4 and 5 (zero-based) are not unique in terms of their columns 2, 3, and 4, so we want to drop one of them. Is there an efficient way to drop rows from an array so that its rows are unique with respect to a selected subset of columns?
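A solution in pure NumPy: np.unique with axis=0 and return_index=True returns the index of the first occurrence of each unique row of the selected columns, and sorting those indices preserves the original row order: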
_, idx = np.unique(array1[:,[2,3,4]], axis=0, return_index=True)
array1[sorted(idx)]
Output:
array([[    1,     4,     3, 64356,  5435,   434],
       [   11,    46,     3,  7356,   585,    74],
       [   51,   406,     3,   769,  5435,    24],
       [   12,    45,     5,   656,   135,   134],
       [  112,   475,     5,   656,  1385,   134]])
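If this is needed in more than one place, the two lines above could be wrapped in a small helper. The following is only a sketch: the function name drop_duplicate_rows and its parameters are made up for illustration and are not part of NumPy. It assumes a 2-D array and keeps the first row seen for each combination of values in the chosen columns.

```python
import numpy as np

def drop_duplicate_rows(arr, cols):
    """Keep the first row for each distinct combination of values in `cols`.

    Hypothetical helper (not a NumPy function); assumes `arr` is 2-D.
    """
    # return_index=True yields the index of the first occurrence
    # of each unique sub-row built from the selected columns.
    _, first_idx = np.unique(arr[:, cols], axis=0, return_index=True)
    # np.unique orders its results by value, so sort the indices
    # to restore the original row order.
    return arr[np.sort(first_idx)]

array1 = np.array([[1, 4, 3, 64356, 5435, 434],
                   [11, 46, 3, 7356, 585, 74],
                   [51, 406, 3, 769, 5435, 24],
                   [12, 45, 5, 656, 135, 134],
                   [112, 475, 5, 656, 1385, 134],
                   [13, 46, 5, 656, 1385, 19]])

result = drop_duplicate_rows(array1, [2, 3, 4])
print(result)
```

Using np.sort(idx) instead of sorted(idx) keeps the index as a NumPy array; both give the same rows here. Without the sort, the rows would come back ordered by the values in the selected columns rather than in their original order.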