It's well known that std::vector<bool>
does not satisfy the Standard's container requirements, mainly because the packed representation prevents T* x = &v[i]
from returning a pointer to a bool.
My question is: can this be remedied/mitigated when the reference_proxy overloads the address-of operator&
to return a pointer_proxy?
The pointer-proxy could contain the same data as the reference_proxy in most implementations, namely a pointer into the packed data and a mask to isolate the particular bit inside the block pointed to. Indirection of the pointer_proxy would then yield the reference_proxy. Essentially both proxies are "fat" pointers, which are, however, still rather light-weight compared to disk-based proxy containers.
Instead of T* x = &v[0]
one could then do auto x = &v[0]
, and use x
like if(*x)
without problems. I would also like to be able to write for(auto b: v) { /* ... */ }
Questions: would such a multi-proxy approach work with the STL's algorithms? Or do some algorithms really rely on the requirement that x
needs to be a real bool*
? Or are there too many consecutive user-defined conversions required that prevent this to work? I'd like to know any of such obstructions before trying to fully complete the above implementation sketch.
UPDATE (based on @HowardHinnant 's answer and this ancient discussion on comp.std.c++)
You can come a long way to almost mimic the builtin types: for any given type T, a pair of proxies (e.g. reference_proxy and iterator_proxy) can be made mutually consistent in the sense that reference_proxy::operator&() and iterator_proxy::operator*() are each other's inverse.
However, at some point one needs to map the proxy objects back to behave like T* or T&. For iterator proxies, one can overload operator->() and access the template T's interface without reimplementing all the functionality. However, for reference proxies, you would need to overload operator.(), and that is not allowed in current C++ (although Sebastian Redl presented such a proposal on BoostCon 2013). You can make a verbose work-around like a .get() member inside the reference proxy, or implement all of T's interface inside the reference (this is what is done for vector::bit_reference), but this will either lose the builtin syntax or introduce user-defined conversions that do not have builtin semantics for type conversions (you can have at most one user-defined conversion per argument).
My question is: can this be remedied/mitigated when the reference_proxy overloads the address-of operator& to return a pointer_proxy?
libc++ actually does this.
#include <vector>
#include <cassert>
int main()
{
std::vector<bool> v(1);
std::vector<bool>::pointer pb = &v[0];
assert(*pb == false);
*pb = true;
assert(v[0] == true);
std::vector<bool>::const_pointer cbp = pb;
assert(*cbp == true);
v[0] = false;
assert(*cbp == false);
}
It even extends to const_pointer
and const_reference
in ways that mimic the same types for vector<int>
. This is a non-conforming extension for libc++. But it makes writing generic code which might be instantiated on vector<bool>
far more likely to compile and behave correctly.
Questions: would such a multi-proxy approach work with the STL's algorithms? Or do some algorithms really rely on the requirement that x needs to be a real bool*? Or are there too many consecutive user-defined conversions required that prevent this to work?
All of libc++'s algorithms work with vector<bool>
. Some of them with quite spectacular performance. One algorithm in particular must have special treatment which the standard unfortunately does not mandate:
#include <vector>
#include <cassert>
int main()
{
std::vector<bool> v(1);
bool b = true;
assert(v[0] == false);
assert(b == true);
std::swap(b, v[0]);
assert(v[0] == true);
assert(b == false);
}
This is very easy for the implementation to accomplish. One simply needs to make sure swap
works for any combination of bool
and vector<bool>::reference
. But I don't know if any implementation besides libc++ does this, and it is not mandated by C++11.
An array of bits is a wonderful data structure. But unfortunately it is poorly specified in the C++ standard. libc++ has gone somewhat outlaw to demonstrate that this can be a very useful and high performance data structure. The hope is that a future C++ standard may migrate in this direction to the benefit of the C++ programmer.