Search code examples
c++stdvectorerase

Find all elements with the same value in a vector, then erase all of them out of the vector


The problem is that I need to find all elements with the same value in a vector, do something with them, then erase all of them out of the vector. Keep doing that until the vector is empty.

vector<unsigned> L;
vector<unsigned>::iterator it, it2, it3;
vector<unsigned> vec;
unsigned Z;

// populate the vector (1, 2, 3, 4, 2, 4)
for(unsigned i = 1; i <= 4; ++i)
   L.push_back(i);
for(unsigned i = 2; i <= 4; i = i + 2)
   L.push_back(i);

it = L.begin();
while(it != L.end() -1){
  cout<< "*it = " << *it << endl;
  Z=0;
  vec.clear();

  it2 = it + 1;
  cout<< "*it2 = " << *it2 << endl;

  while(it2 != L.end()){
     cout << "Loop *it2 = " << *it2 <<endl;
     if(*it == *it2){
        vec.push_back(*it);
        L.erase(it2);  // iterator automatically points to the next element
        cout<< "after erase(it2), *it2 = " << *it2 << endl;
        continue;
     }
     ++it2;
  }
// do something (here I calculate the average)
  for(it3 = vec.begin(); it3 != vec.end(); ++it3)
     Z = Z+ *it3;

  Z= Z/vec.size();

  cout<< "Z = " << Z << endl << endl;
  L.erase(it); // iterator automatically points to the next element
}

The output is:

*it = 1
*it2 = 2
Loop *it2 = 2
Loop *it2 = 3
Loop *it2 = 4
Loop *it2 = 2
Loop *it2 = 4

Then it stops working

If I populate the vector with this code

// populate the vector (1, 2, 3, 4, 1, 3)
for(unsigned i = 1; i <= 4; ++i)
   L.push_back(i);
for(unsigned i = 1; i <= 4; i = i + 2)
   L.push_back(i);

Then the output is

*it = 1
*it2 = 2
Loop *it2 = 2
Loop *it2 = 3
Loop *it2 = 4
Loop *it2 = 1
after erase(it2), *it2 = 3
Loop *it2 = 3
Z = 1

*it = 2
*it2 = 3
Loop *it2 = 3
Loop *it2 = 4
Loop *it2 = 3

It stops working here

I know there is something wrong in the second while loop but I can't figure out what it is. Any help would be appreciated.


Solution

  • If the goal is to

    1. Process duplicates and then
    2. Erase them

    there is a much easier solution to this, and that is to use std::stable_partition, along with a std::set:

    #include <algorithm>
    #include <vector>
    #include <iterator>
    #include <set>
    #include <iostream>
    #include <numeric>
    
    using namespace std;
    
    int main()
    {
        std::vector<int> L = { 1, 2, 4, 3, 2, 4 };
        std::set<int> tempSet;
        //...
        // partition the elements, unique items on left, duplicates on right
        auto divider = stable_partition(L.begin(), L.end(), [&](int n)
        {
            // return true if item is not in the set, false otherwise
            return tempSet.insert(n).second;
        });
    
        // do something with the duplicates, for example, print them
        cout << "Here are the dups:\n";
        copy(divider, L.end(), ostream_iterator<int>(cout, " "));
    
        // get the average
    
        // get number of duplicates  
        size_t numDups = std::distance(divider, L.end());  
        double avg = 0.0;
    
        // compute average using std::accumulate
        if ( numDups > 0 ) 
           avg = std::accumulate(divider, L.end(), 0.0) / numDups;
        cout << "\nThe average of the duplicates is: " << avg << "\n";
    
        // erase the duplicates
        L.erase(divider, L.end());
    
        // print the updated vector now
        cout << "\n\nHere is the resulting vector:\n";
        copy(L.begin(), L.end(), ostream_iterator<int>(cout, " "));
    }
    

    Live Example, C++ 14

    Here is the same code, but using C++ 03 (for those without C++11 / 14):

    Live Example, C++ 03

    Note there are no loops in the code above. Everything is done by an algorithm function. Partitioning, average computation, erasing, etc., are all performed with no loops.

    Basically we test each item to see if the item exists in the set. If it does, then it will go to the right of the partition, if not, then the item goes to the left of the partition.

    The return value divider is an iterator to the element that is the "dividing line" in the partition. Once the items are partitioned, then we can process them easily by using divider to tell us where the "good" items are and where the "about to be erased" items are.

    BTW, this worked the first time I compiled it successfully -- the one major reason for this "luck" in getting it to work quickly is that the algorithm functions just plain work when given the correct parameters (and if the predicate function is written correctly). Erasing items, moving items, etc. in a container, especially a sequence container such as vector, is almost always covered by usage of 1, 2, or 3 algorithm functions.