I have an array of structs; the array is of size N.
I want to remove duplicates from the array; that is, do an in-place change, converting the array to have a single appearance of each struct. Additionally, I want to know the new size M (highest index in the reduced array).
The structs include primitives so it's trivial to compare them.
How can I do that efficiently in C++?
I have implemented the following operators:
bool operator==(const A &rhs1, const A &rhs2)
{
return ( ( rhs1.x== rhs2.x ) &&
( rhs1.y == rhs2.y ) );
}
bool operator<(const A &rhs1, const A &rhs2)
{
if ( rhs1.x == rhs2.x )
return ( rhs1.y < rhs2.y );
return ( rhs1.x < rhs2.x );
}
However, I get an error when running:
std::sort(array, array+ numTotalAvailable);
* array will have all elements here valid.
std::unique_copy(
array,
array+ numTotalAvailable,
back_inserter(uniqueElements));
* uniqueElements will have non-valid elements.
What is wrong here?
You could use a combination of the std::sort
and std::unique
algorithms to accomplish this:
std::sort(elems.begin(), elems.end()); // Now in sorted order.
iterator itr = std::unique(elems.begin(), elems.end()); // Duplicates overwritten
elems.erase(itr, elems.end()); // Space reclaimed
If you are working with a raw array (not, say, a std::vector
), then you can't actually reclaim the space without copying the elements over to a new range. However, if you're okay starting off with a raw array and ending up with something like a std::vector
or std::deque
, you can use unique_copy
and an iterator adapter to copy over just the unique elements:
std::sort(array, array + size); // Now in sorted order
std::vector<T> uniqueElements;
std::unique_copy(array, array + size,
back_inserter(uniqueElements)); // Append unique elements
At this point, uniqueElements
now holds all the unique elements.
Finally, to more directly address your initial question: if you want to do this in-place, you can get the answer by using the return value from unique
to determine how many elements remain:
std::sort(elems, elems + N); // Now in sorted order.
T* endpoint = std::unique(elems, elems + N);// Duplicates overwritten
ptrdiff_t M = endpoint - elems; // Find number of elements left
Hope this helps!