c++ sockets c++20 strict-aliasing address-sanitizer

UNIX sockets::recv, std::byte, and strict aliasing


I'm writing a function that basically wraps recv:

ssize_t recv(int sockfd, void *buf, size_t len, int flags);

In particular, I want to receive some bytes; sometimes these bytes will be part of an ASCII string, other times they will be integers, or maybe just plain "bytes" that are part of some higher-level protocol.

I thought that maybe the right way to abstract this in modern C++ would be to receive into a std::byte buffer, so maybe something like this:

std::vector<std::byte> buffer;
buffer.resize(100);
recv(socket, buffer.data(), 100, /* flags = */ 0);

My first question is: are there any issues with writing to a "buffer" of std::bytes as above? Should the buffer be of type std::vector<char> instead? I guess this is fine but I'm not 100% sure.

My second question is the following: say that now I want to treat buffer as a string. The code

std::string str(buffer.data(), 100);

fails because std::byte* does not convert to const char*, and I'm almost sure that

std::string str(reinterpret_cast<const char*>(buffer.data()), 100);

is undefined behavior because of the strict aliasing rule.

Is the only way to do this something like memcpy:

std::string ret;
ret.resize(100);
std::memcpy(ret.data(), buffer.data(), 100);

?

What if I want a std::string_view? Can I make a std::string_view of buffer without actually copying the bytes to some intermediate place first? Could std::bit_cast work?

Interestingly, clang does not complain about something similar to the std::string(reinterpret_cast... solution: https://godbolt.org/z/7zshhr (even compiling with -fsanitize=address or -fsanitize=undefined)


Solution

  • For this purpose there is no practical difference between a char and a std::byte: both are one-byte types that are allowed to alias any object, so either works as a raw receive buffer. In fact, these network functions were originally defined in terms of the char data type. A very long time ago the second parameter to recv was a char *, instead of the void * it is now.

    Using a char buffer would make converting the std::vector<char> into a std::string a nothing-burger. You may even choose to forgo std::vector<char> entirely: resize your std::string up front, recv() directly into it, then resize it again according to how many bytes were actually received.