Search code examples
c++g++functorstl-algorithm

Are Standard Library algorithms allowed to copy predicate arguments?


Suppose we'd like to remove duplicate values from a vector of ints. The usual solution is to sort the vector and erase duplicates with erase-remove idiom. But we need to mantain the order of the elements that will not be removed, so we can't sort. So one might come up with a predicate like this and use with with remove_if algorithm:

struct comp {
    std::set<int> s;
    comp() : s() {}
    bool operator()(int i)
    {
        return !(s.insert(i)).second;
    }
};

But this will break if predicate object will be copied for some reason, since we'll get two copies of the set member. And indeed, the gcc's implementation of remove_if does exactly that:

template<typename _ForwardIterator, typename _Predicate>
    _ForwardIterator
    remove_if(_ForwardIterator __first, _ForwardIterator __last,
          _Predicate __pred)
    {

      __first = _GLIBCXX_STD_A::find_if(__first, __last, __pred);

      if(__first == __last)                             // ^^^^^ here a copy is made
        return __first;
      _ForwardIterator __result = __first;
      ++__first;
      for(; __first != __last; ++__first)
        if(!bool(__pred(*__first)))
          {
            *__result = _GLIBCXX_MOVE(*__first);
            ++__result;
          }
      return __result;
    }

The workaround is to make set member of our functor static:

struct comp {
    static set<int> s;
    comp() { s. clear(); }
    bool operator()(int i)
    {
        return !(s.insert(i)).second;
    }
};
set<int> comp::s;

But the question remains:

Do we need to make sure a possible copy of predicate functor will not break our logic? Is there anything in the standard that mandates (or prohibits) certain behaviour with regard to this issue? Or is it a bug in the implementation?


Solution

  • Yes, the standard does not specify how many times the predicate will be copied, nor does it say in what order the predicate will be applied to elements of the container. Essentially, predicates must act like pure functions; they must have no observable state.1

    So remove_if does not sound like an appropriate algorithm here. Hacks such as storing the set externally to the functor will not solve the problem; you'll still be invoking undefined behaviour.


    1. For a more in-depth discussion, see Item 39 ("Make predicates pure functions") of Scott Meyers' Effective STL.