Search code examples
c++liststdlist

Fast search and delete in a std::list of objects


I have a very large list of objects (nodes), and I want to be able to remove/delete elements of the list based on a set of values inside of them. Preferably in constant time...

The objects (among other things) has values like:

long long int nodeID;
int depth;
int numberOfClusters;
double [] points;
double [][] clusters;

What I need to do is to look through the list, and check if there are any elements that has the same values in all fields except for nodeID.

Right now I'm doing something like this:

for(i = nodes.begin(); i != nodes.end(); i++)
{
    for(j = nodes.begin(); j != nodes.end(); j++)
    {
        if(i != j)
        {
            if(compareNodes((*i), (*j)))
            {
                j = nodes.erase (j);
            }
        }
    }
}

Where compareNodes() compares the values inside the two nodes. But this is wildly inefficient.

I'm using erasebecause that seems to be the only way to delete an element in the middle of a std::list.

Optimally, I would like to be able to find an element based on these values, and remove it from the list if it exists.

I am thinking some sort of hash map to find the element (a pointer to the element) in constant time, but even if I can do that, I can't find a way to remove the element without iterating through the list. It seemes that I have to use erase , but that requires iterating through the list, which means linear complexity in the list size. There is also remove_if but again, same problem linear complexity in list size.

Is there no way to get remove an element from a std::list without iterating through the whole list?


Solution

  • First off, you can speed up your existing solution by starting j at std::next(i) instead of nodes.begin() (assuming your compareNodes function is commutative).

    Second, the hashmap approach sounds viable. But why keep a pointer to the element as a value in the map, when you can keep an iterator? They're both "a thing which references the element," but you can use the iterator to erase the element. And std::list iterators don't invalidate when the list is modified (they're most probably just pointers under the hood).

    Thirdly, if you want to encapsulate/automate the lookup & sequential access, you can look into Boost.Multi-index to build a container with both sequential and hashed access.