Suppose we'd like to remove duplicate values from a vector of int
s. The usual solution is to sort the vector and erase duplicates with erase-remove idiom. But we need to mantain the order of the elements that will not be removed, so we can't sort. So one might come up with a predicate like this and use with with remove_if
algorithm:
struct comp {
std::set<int> s;
comp() : s() {}
bool operator()(int i)
{
return !(s.insert(i)).second;
}
};
But this will break if predicate object will be copied for some reason, since we'll get two copies of the set
member. And indeed, the gcc's implementation of remove_if
does exactly that:
template<typename _ForwardIterator, typename _Predicate>
_ForwardIterator
remove_if(_ForwardIterator __first, _ForwardIterator __last,
_Predicate __pred)
{
__first = _GLIBCXX_STD_A::find_if(__first, __last, __pred);
if(__first == __last) // ^^^^^ here a copy is made
return __first;
_ForwardIterator __result = __first;
++__first;
for(; __first != __last; ++__first)
if(!bool(__pred(*__first)))
{
*__result = _GLIBCXX_MOVE(*__first);
++__result;
}
return __result;
}
The workaround is to make set
member of our functor static:
struct comp {
static set<int> s;
comp() { s. clear(); }
bool operator()(int i)
{
return !(s.insert(i)).second;
}
};
set<int> comp::s;
But the question remains:
Do we need to make sure a possible copy of predicate functor will not break our logic? Is there anything in the standard that mandates (or prohibits) certain behaviour with regard to this issue? Or is it a bug in the implementation?
Yes, the standard does not specify how many times the predicate will be copied, nor does it say in what order the predicate will be applied to elements of the container. Essentially, predicates must act like pure functions; they must have no observable state.1
So remove_if
does not sound like an appropriate algorithm here. Hacks such as storing the set
externally to the functor will not solve the problem; you'll still be invoking undefined behaviour.