I am trying to remove points from a point cloud that are too close to each other. My input is an mx3 matrix where the columns represent xyz coordinates. Code is as follows:
import numpy as np

def remove_duplicates(points, threshold):
    # Convert to numpy
    points = np.array(points)
    # Round to within the threshold
    rounded_points = points
    if threshold > 0.0:
        rounded_points = np.round(points/threshold)*threshold
    # Remove duplicate points
    point_tuples = [tuple(point) for point in rounded_points]
    unique_rounded_points, unique_indices = np.unique(point_tuples, return_index = True)
    points = points[unique_indices]
    return points
The issue I am running into is that unique_indices contains values larger than the length of points (2265 and 1000 for my test data). Am I doing something wrong, or is this a bug in NumPy?
Edit: I should note that for very small inputs (tried 27 points), unique() appears to work correctly.
So points is a 2d array, (m,3) in shape, right?
point_tuples is a list of tuples, i.e. each row of rounded_points is now a tuple of 3 floats.
np.unique is going to turn that into an array to do its thing. np.array(point_tuples) is a (m,3) array (again 2d, like points). The tuple conversion did nothing.
unique will act on the raveled form of this array, so unique_indices could have values between 0 and 3*m. Hence your error.
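To make the flattening concrete, here is a small sketch with made-up data (three points, two of them identical). It only illustrates the behaviour described above; the array contents are my own example, not your data.

```python
import numpy as np

# Made-up example data: rows 0 and 1 are identical.
points = np.array([[0.0, 1.0, 2.0],
                   [0.0, 1.0, 2.0],
                   [3.0, 4.0, 5.0]])
point_tuples = [tuple(p) for p in points]

# The list of tuples becomes an ordinary 2d array again.
as_array = np.array(point_tuples)
print(as_array.shape)      # (3, 3)

# Without an axis argument, unique works on the flattened array,
# so the indices refer to the 9 flattened values, not the 3 rows.
values, unique_indices = np.unique(point_tuples, return_index=True)
print(unique_indices)      # [0 1 2 6 7 8]
```

Indices such as 6, 7 and 8 are out of range for an array with only 3 rows, which matches the symptom in the question.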
I see 2 problems. First, if you want unique to find unique 'rows', you need to make a structured array:
np.array(point_tuples, 'f,f,f')
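A minimal sketch of the structured-array approach, again on made-up data. Note one assumption on my part: I use 'f8,f8,f8' (three float64 fields) rather than 'f,f,f', because 'f' is float32 and would reduce the precision of the stored coordinates.

```python
import numpy as np

# Made-up example data: rows 0 and 1 are identical.
points = np.array([[0.0, 1.0, 2.0],
                   [0.0, 1.0, 2.0],
                   [3.0, 4.0, 5.0]])
point_tuples = [tuple(p) for p in points]

# Each tuple becomes ONE record with three float64 fields,
# so the result is 1d with one element per point.
structured = np.array(point_tuples, dtype='f8,f8,f8')
print(structured.shape)    # (3,)

# unique now compares whole records, and the indices refer to rows.
unique_rows, unique_indices = np.unique(structured, return_index=True)
print(points[unique_indices])
```

With this layout the returned indices can be used directly to index points.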
Also, applying unique to floats is tricky. Two floats that look equal are often not exactly equal, because values computed in different ways can differ in their last bits. Rounding reduces this problem but does not eliminate it.
So it is probably better to round in such a way that rounded_points is an array of integers. The values don't need to be scaled back to match points.
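Here is one way the integer idea could look. This is a sketch, not a tested drop-in replacement: the function name remove_close_points is hypothetical, and it assumes a NumPy version whose unique accepts the axis argument (1.13 or later), which sidesteps the structured array by comparing whole rows directly.

```python
import numpy as np

def remove_close_points(points, threshold):
    """Keep one point per grid cell of size `threshold` (hypothetical helper)."""
    points = np.array(points, dtype=float)
    if threshold > 0.0:
        # Integer grid coordinates; no need to scale back to real units.
        keys = np.round(points / threshold).astype(np.int64)
    else:
        # No rounding requested: compare the raw coordinates.
        keys = points
    # axis=0 makes unique compare whole rows instead of single values.
    _, unique_indices = np.unique(keys, axis=0, return_index=True)
    # Sort the indices so the surviving points keep their original order.
    return points[np.sort(unique_indices)]

cloud = [[0.0, 0.0, 0.0], [0.01, 0.0, 0.0], [1.0, 1.0, 1.0], [1.02, 1.0, 1.0]]
print(remove_close_points(cloud, 0.1))
```

Keep in mind that this is grid snapping: two points closer than the threshold can still land in different cells if they sit on opposite sides of a cell boundary, so it reduces near-duplicates rather than guaranteeing a minimum spacing.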
I can add an example if needed, but first try these suggestions. I'm making a lot of guesses about your data, and I'd like to get some feedback before going further.