I'm porting some code to Linux, and I discovered a rather interesting bug in code that apparently works on Windows but not on Linux. A class with some string members was being initialized via memset(this), which apparently works on Windows but causes a segmentation fault on Linux in the string destructor.
Yes, I know using memset() for this is horrible practice, and I'm fixing it.
SSCCE:
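#include <cstring>
#include <iostream>
#include <string>

int main()
{
    std::string tmp;
    std::cout << "String instantiated" << std::endl;
    // Deliberately wrong: overwrites the string's internal state with zero bytes.
    std::memset(&tmp, 0, sizeof(tmp));
    std::cout << "String memset" << std::endl;
    return 0;  // tmp's destructor runs here, on the zeroed object
}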
This runs fine on Windows, but the string destructor segfaults on Linux.
Compilers:
I understand that this was (and is) horrible practice either way, but how did it ever work in the first place?
If you want to dig into implementation details: MSVC and Clang (with libc++) implement string with the short-string optimization, which looks roughly like this:
class string {
    size_t length;
    char* ptr;
    char short_buf[N];
};
So if it is memset to 0, its destructor will see a length of zero and will probably do nothing. Even if it does attempt delete[] ptr, it won't crash, because delete[] on a null pointer is a no-op.
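GCC, by contrast, until very recently used a quite different string implementation based on copy-on-write and reference counting. In that design the string object holds little more than a pointer into a heap block that also carries the reference count, so a zeroed pointer most likely sends the destructor looking for that bookkeeping near address zero. It's no surprise that it crashes after memset.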
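As for the fix itself, here is a minimal sketch, assuming a class shaped roughly like the one described in the question. Record, its members, zero_bytes and reset are hypothetical names, not taken from the original code. Default member initializers replace the memset, and a static_assert turns any remaining raw zeroing of a non-trivial type into a compile error.

```cpp
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the class from the question.
struct Record {
    int id = 0;           // default member initializers replace the memset
    double weight = 0.0;
    std::string name;     // std::string's own constructor makes it empty
    std::string comment;
};

// Guarded helper: zeroing raw bytes is only valid for trivially copyable
// types, so anything containing a std::string is rejected at compile time.
template <typename T>
void zero_bytes(T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memset is only valid on trivially copyable types");
    std::memset(&obj, 0, sizeof(T));
}

// Resets a Record by assigning a freshly value-initialized object.
inline void reset(Record& r) { r = Record{}; }
```

Calling reset(r) returns every member to its initial value, while zero_bytes(r) on a Record fails to compile, which catches this class of bug on every platform instead of only on the ones where it happens to crash.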