c++ linux windows undefined-behavior

You can `memset()` over a `std::string` on Windows?


I'm porting some code to Linux, and I discovered a rather interesting bug in code that apparently works on Windows but not on Linux. A class with some string members was being initialized via memset(this), which apparently works on Windows but causes a segmentation fault in the string destructor on Linux.

Yes, I know using memset() for this is horrible practice, and I'm fixing it.

SSCCE:

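#include <cstring>   // std::memset
#include <iostream>
#include <string>    // std::string

int main()
{
    std::string tmp;
    std::cout << "String instantiated" << std::endl;
    // Undefined behavior: overwrites the string's internal bookkeeping.
    std::memset(&tmp, 0, sizeof(tmp));
    std::cout << "String memset" << std::endl;

    return 0;  // tmp's destructor runs here, on the zeroed object
}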

This runs fine on Windows, but the string destructor segfaults on Linux.

Compilers:

  • MSVC++ 2013 (Microsoft (R) C/C++ Optimizing Compiler Version 18.00.31101 for x64)
  • g++ (Ubuntu 4.8.2-19ubuntu1) 4.8.2

I understand that this was (and is) horrible practice either way, but how did it ever work in the first place?


Solution

  • If you want to dig into implementation details, MSVC and Clang (with libc++) use a string with the short-string optimization, which looks roughly like this:

    class string {
        size_t length;
        char* ptr;
        char short_buf[N];
    };
    

    So after a memset to 0, the destructor sees a length of zero and probably does nothing. Even if it does attempt delete[] ptr, it won't crash, because deleting a null pointer is a no-op. This is still undefined behavior; it just happens to be benign with that layout.

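    GCC, by contrast, until very recently used a quite different string implementation based on copy-on-write and reference counting (libstdc++ only moved away from it with the new ABI in GCC 5). In that implementation the string object holds a single pointer to the character data, and the reference count lives in a header placed just before that data. A zeroed string therefore makes the destructor look for that header just below address zero, so it's no surprise it crashes after memset.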