I want to use fixed-length runs of contiguous bytes from a long byte array s as keys in a std::map<std::array<char,N>,int>.
Can I do this without copying by reinterpreting subarrays of s as std::array<char,N>?
Here is a minimal example:
#include <array>
#include <map>
int main() {
    std::map<std::array<char,10>,int> m;
    const char* s="Some long contiguous data";
    // reinterpret some contiguous 10 bytes of s as std::array<char,10>
    // Is this UB or valid?
    const std::array<char,10>& key=*reinterpret_cast<const std::array<char,10>*>(s+5);
    m[key]=1;
}
I would say yes, because char is a POD type with no alignment requirement (in contrast to larger POD types, see https://stackoverflow.com/a/32590117/6212870). Therefore, it should be OK to reinterpret_cast to std::array<char,N> starting at any address, as long as the covered bytes are still a subrange of s, i.e. as long as I make sure there is no buffer overflow.
Can I really do such a reinterpret_cast, or is it UB?
EDIT:
In the comments, people correctly pointed out that I cannot know for sure that, for std::array<char,10> arr, it holds that (void*)&arr==(void*)&arr[0], because the internal C-array data member of the std::array class template might be padded, even though this typically should not be the case, especially since we are considering a char POD array. So I update my question:
Can I rely on the reinterpret_cast as done above if I check via static_assert that there is indeed no padding? Of course the code won't compile anymore on compiler/platform combinations where there is padding, so I won't use this method there. But I want to know: are there other concerns apart from the padding? Or is the code valid with a static_assert check?
No: there is no object of type std::array<char,10> at that address, regardless of the layout of that type. (The special rules for char do not apply to a type that happens to have char subobjects.) As always, it is not the reinterpret_cast itself whose behavior is undefined, but rather the access through that non-object when using it as a map key. (What you are allowed to do in this case is merely cast it back to the real type, for use with C-like interfaces that require a fixed pointer type but do not actually use the object.)
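This access also of course involves a copy; if your goal was to avoid copying at all, just make a std::map<const char*,int,ten_cmp> where ten_cmp is a functor type that compares 10 bytes starting from each address (via std::strncmp or std::string_view; note that std::strncmp stops at a null byte, so std::memcmp or std::string_view is the safer choice if the data may contain '\0').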
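A minimal sketch of such a comparator, assuming 10-byte keys and a buffer that outlives the map (the map stores only pointers into it). Only ten_cmp is a name from the text above; kKeyLen, window_map and count_window are hypothetical and exist just for illustration. The comparison goes through std::string_view, which compares exactly the given number of bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string_view>

// Hypothetical constant: the key width used in the question.
constexpr std::size_t kKeyLen = 10;

// Orders two pointers by the kKeyLen bytes each one points at.
// Assumes at least kKeyLen readable bytes exist behind both pointers.
struct ten_cmp {
    bool operator()(const char* a, const char* b) const {
        return std::string_view(a, kKeyLen) < std::string_view(b, kKeyLen);
    }
};

// The map stores pointers into the buffer, so no key bytes are copied.
using window_map = std::map<const char*, int, ten_cmp>;

// Hypothetical helper: bumps the counter for the window starting at offset.
void count_window(window_map& m, std::string_view data, std::size_t offset) {
    assert(offset + kKeyLen <= data.size());  // the window must lie inside data
    ++m[data.data() + offset];
}
```

Two pointers to different positions that hold the same 10 bytes are treated as the same key, so calling count_window for both increments a single entry.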
If you do want the map to own its key data, just std::memcpy from the string into a key; compilers often recognize that such temporary “buffers” don’t need to exist independently and actually read from the source in the fashion you hope to do with reinterpret_cast.
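A minimal sketch of the std::memcpy approach, assuming the source bytes are available as a std::string_view. The helper name make_key is hypothetical and not part of any library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <string_view>

// Hypothetical helper: copies N bytes starting at data[offset] into an owned key.
template <std::size_t N>
std::array<char, N> make_key(std::string_view data, std::size_t offset) {
    assert(offset + N <= data.size());  // the N bytes must lie inside data
    std::array<char, N> key;
    std::memcpy(key.data(), data.data() + offset, N);  // well-defined byte copy
    return key;
}
```

With this, the line from the question becomes m[make_key<10>(s, 5)] = 1; for a std::map<std::array<char,10>,int> m, and the map owns a real std::array<char,10> object for each key.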