I'm writing a websocket server and I have to deal with masked data that I need to unmask.
The mask is unsigned char[4], and the data is an unsigned char* buffer as well.
I don't want to XOR byte by byte, I'd much rather XOR 4-bytes at a time.
uint32_t * const end = reinterpret_cast<uint32_t *>(data_+length);
for(uint32_t *i = reinterpret_cast<uint32_t *>(data_); i != end; ++i) {
    *i ^= mask_;
}
Is there anything wrong with the use of reinterpret_cast in this situation?
The alternative would be the following code, which isn't as clear and isn't as fast:
uint64_t j = 0;
uint8_t *end = data_+length;
for(uint8_t *i = data_; i != end; ++i,++j) {
    *i ^= mask_[j % 4];
}
I'm all ears for alternatives, including ones dependent on C++11 features.
There are a few potential problems with the approach posted:

- The char buffer needs to be aligned properly to be accessible as uint32_t. A typical requirement for uint32_t is that the object is aligned to an address divisible by four.
- If length % sizeof(uint32_t) != 0 the loop may never terminate.
- mask_ needs to contain the four mask bytes in the right order for the platform. If mask_ is produced by *reinterpret_cast<uint32_t*>(char_mask) of a suitable array, this shouldn't be a problem.

If these issues are taken care of, reinterpret_cast<...>(...) can be used in the situation you have. Reinterpreting the meaning of pointers is one of the reasons this operation is there and sometimes it is needed. I would create a suitable test case to verify that it works properly, though, to avoid having to hunt down problems when porting the code to a different platform.
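To make those points concrete, here is the sort of test case I have in mind, as a minimal self-contained sketch. The buffer contents, mask bytes, and names are made-up choices for illustration, and the 32-bit mask is built with std::memcpy instead of the reinterpret_cast of the mask array, which sidesteps any alignment question for the mask itself:

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

int main() {
    // Made-up mask bytes and payload, purely for illustration.
    unsigned char char_mask[4] = {0x12, 0x34, 0x56, 0x78};

    // Length is a multiple of sizeof(uint32_t); alignas guarantees the
    // alignment requirement discussed above.
    alignas(std::uint32_t) unsigned char data_[16];
    std::size_t const length = sizeof(data_);
    for (std::size_t k = 0; k != length; ++k)
        data_[k] = static_cast<unsigned char>('a' + k);

    // Byte-by-byte reference result to compare against.
    unsigned char expected[sizeof(data_)];
    std::memcpy(expected, data_, length);
    for (std::size_t j = 0; j != length; ++j)
        expected[j] ^= char_mask[j % 4];

    // Preconditions for the word-at-a-time loop.
    assert(length % sizeof(std::uint32_t) == 0);
    assert(reinterpret_cast<std::uintptr_t>(data_) % alignof(std::uint32_t) == 0);

    // Build the 32-bit mask; memcpy keeps the bytes in memory order and avoids
    // any alignment question for char_mask itself.
    std::uint32_t mask_;
    std::memcpy(&mask_, char_mask, sizeof(mask_));

    // The word-at-a-time loop from the question.
    std::uint32_t* const end = reinterpret_cast<std::uint32_t*>(data_ + length);
    for (std::uint32_t* i = reinterpret_cast<std::uint32_t*>(data_); i != end; ++i)
        *i ^= mask_;

    assert(std::memcmp(data_, expected, length) == 0);
}

On typical compilers the four-byte memcpy compiles down to a single 32-bit load, so nothing is lost by spelling it that way.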
Personally I would go with a different approach until profiling shows that it is too slow:
char* it(data);
if (4 < length) {
    // process whole four-byte groups first
    for (char* end(data + length - 4); it < end; it += 4) {
        it[0] ^= mask_[0];
        it[1] ^= mask_[1];
        it[2] ^= mask_[2];
        it[3] ^= mask_[3];
    }
}
// deal with the up to four remaining bytes; the parentheses are needed
// because ^= binds more weakly than &&
it != data + length && (*it++ ^= mask_[0]);
it != data + length && (*it++ ^= mask_[1]);
it != data + length && (*it++ ^= mask_[2]);
it != data + length && (*it++ ^= mask_[3]);
I'm definitely using a number of similar approaches in software which is meant to be really fast and haven't found them to be a notable performance problem.
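If you want numbers before committing to the cast-based version, a rough timing harness along these lines is usually enough. The function names, buffer size, and iteration count are arbitrary assumptions, and the word-wise variant assumes an aligned buffer whose length is a multiple of four:

#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

// The two variants discussed above, wrapped in functions so they can be timed.
void unmask_bytes(unsigned char* data, std::size_t length, unsigned char const* mask) {
    for (std::size_t j = 0; j != length; ++j)
        data[j] ^= mask[j % 4];
}

void unmask_words(unsigned char* data, std::size_t length, unsigned char const* mask) {
    std::uint32_t m;
    std::memcpy(&m, mask, sizeof(m));  // assumes aligned data and length % 4 == 0
    std::uint32_t* const end = reinterpret_cast<std::uint32_t*>(data + length);
    for (std::uint32_t* i = reinterpret_cast<std::uint32_t*>(data); i != end; ++i)
        *i ^= m;
}

template <typename Fn>
double time_ms(Fn fn, std::vector<unsigned char>& buffer, unsigned char const* mask) {
    auto start = std::chrono::steady_clock::now();
    for (int run = 0; run != 1000; ++run)
        fn(buffer.data(), buffer.size(), mask);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

int main() {
    unsigned char mask[4] = {0x12, 0x34, 0x56, 0x78};
    std::vector<unsigned char> buffer(1 << 20, 'x');  // 1 MiB payload, length % 4 == 0
    std::printf("byte-wise: %.2f ms\n", time_ms(unmask_bytes, buffer, mask));
    std::printf("word-wise: %.2f ms\n", time_ms(unmask_words, buffer, mask));
    std::printf("(keep the result observable: %d)\n", buffer[0]);  // defeat dead-store elimination
}

Which variant wins depends a lot on how well the compiler vectorizes the plain byte loop, so it is worth measuring with your actual compiler and flags.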