Search code examples
c++sortingvectorerase

vector sort and erase won't work


When using this code to remove duplicates I get invalid operands to binary expression errors. I think that this is down to using a vector of a struct but I am not sure I have Googled my question and I get this code over and over again which suggests that this code is right but it isn't working for me.

std::sort(vec.begin(), vec.end());
vec.erase(std::unique(vec.begin(), vec.end()), vec.end());

Any help will be appreciated.

EDIT:

fileSize = textFile.size();
vector<wordFrequency> words (fileSize);
int index = 0;
for(int i = 0; i <= fileSize - 1; i++)
{
    for(int j = 0; j < fileSize - 1; j++)
    {
        if(string::npos != textFile[i].find(textFile[j]))
        {
            words[i].Word = textFile[i];
            words[i].Times = index++;
        }
    }
    index = 0;
}

sort(words.begin(), words.end());
words.erase(unique(words.begin(), words.end(), words.end()));

Solution

  • First problem.

    unique used wrongly

    unique(words.begin(), words.end(), words.end()));
    

    You are calling the three operand form of unique, which takes a start, an end, and a predicate. The compiler will pass words.end() as the predicate, and the function expects that to be your comparison functor. Obviously, it isn't one, and you enter the happy world of C++ error messages.

    Second problem.

    either use the predicate form or define an ordering

    See the definitions of sort and unique.

    You can either provide a

    bool operator< (wordFrequency const &lhs, wordFrequency const &rhs)
    {
        return lhs.val_ < rhs.val_;
    }
    

    , but only do this if a less-than operation makes sense for that type, i.e. if there is a natural ordering, and if it's not just arbitrary (maybe you want other sort orders in the future?).

    In the general case, use the predicate forms for sorting:

    auto pred = [](wordFrequency const &lhs, wordFrequency const &rhs)
    {
        return lhs.foo < rhs.foo;
    };
    
    sort (words.begin(), words.end(), pred);
    words.erase (unique (words.begin(), words.end(), pred));
    

    If you can't C++11, write a functor:

    struct FreqAscending { // should make it adaptible with std::binary_function
        bool operator() (wordFrequency const &lhs, wordFrequency const &rhs) const
        { ... };
    };
    

    I guess in your case ("frequency of words"), operator<makes sense.

    Also note vector::erase: This will remove the element indicated by the passed iterator. But, see also std::unique, unique returns an iterator to the new end of the range, and I am not sure if you really want to remove the new end of the range. Is this what you mean?

    words.erase (words.begin(),
                 unique (words.begin(), words.end(), pred));
    

    Third problem.

    If you only need top ten, don't sort

    C++ comes with different sorting algorithms (based on this). For top 10, you can use:

    This wastes less watts on your CPU, will contribute to overall desktop performance, and your laptop batteries last longer so can do even more sorts.