c++ solaris bus-error

Bus error with allocated memory on a heap


I get a Bus Error in this code:

char* mem_original;
int int_var = 987411;
mem_original = new char [250];
memcpy(&mem_original[250-sizeof(int)], &int_var, sizeof(int));
...
const unsigned char* mem_u_const = (unsigned char*)mem_original;
...
const unsigned char *location = mem_u_const + 250 - sizeof(int);

std::cout << "sizeof(int) = " << sizeof(int) << std::endl;//it's printed out as 4
std::cout << "byte 0 = " << int(*location) << std::endl;
std::cout << "byte 1 = " << int(*(location+1)) << std::endl;
std::cout << "byte 2 = " << int(*(location+2)) << std::endl;
std::cout << "byte 3 = " << int(*(location+3)) << std::endl;
int original_var = *((const int*)location);
std::cout << "original_var = " << original_var << std::endl;

That works a few times, printing out:

sizeof(int) = 4
byte 0 = 0
byte 1 = 15
byte 2 = 17
byte 3 = 19
original_var = 987411

And then it fails with:

sizeof(int) = 4
byte 0 = 0
byte 1 = 15
byte 2 = 17
byte 3 = 19
Bus Error

It's built and run on Solaris OS (C++ 5.12). The same code works fine on Linux (gcc 4.12) and Windows (msvc-9.0).

We can see:

  1. memory was allocated on the heap by new[].
  2. memory is accessible (we can read it byte by byte)
  3. memory contains exactly what there should be, not corrupted.

So what could be the reason for the Bus Error? Where should I look?

UPD: If I memcpy(...) from location into original_var at the end, it works. But what is the problem with *((const int*)location)?


Solution

  • This is a common issue for developers with no experience on hardware that has alignment restrictions - such as SPARC. x86 hardware is very forgiving of misaligned access, albeit with performance impacts. Other types of hardware? SIGBUS.

    This line of code:

    int original_var = *((const int*)location);
    

    invokes undefined behavior. You're taking an unsigned char * and interpreting what it points to as an int. You can't do that safely. Period. It's undefined behavior - for the very reason you're experiencing.
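To make the alignment point concrete, here is a minimal sketch of a check you could run before the cast. The helper names `aligned_for_int` and `tail_offset` are hypothetical, and the code assumes a C++11 compiler (`alignof`, `std::uintptr_t`), which may not match the compiler used in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if p is suitably aligned to be read as an int.
// Assumes C++11 (alignof) and that std::uintptr_t is available.
inline bool aligned_for_int(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(int) == 0;
}

// Offset of an int stored in the last sizeof(int) bytes of an n-byte buffer,
// mirroring the "250 - sizeof(int)" expression in the question.
inline std::size_t tail_offset(std::size_t n) {
    assert(n >= sizeof(int));
    return n - sizeof(int);
}
```

With a 4-byte int, `tail_offset(250)` is 246, which is not a multiple of 4. If the buffer itself starts on a 4-byte boundary, `aligned_for_int(buffer + 246)` is therefore false, and dereferencing that address as an int is what traps on SPARC.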

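    You're violating the strict aliasing rule. See What is the strict aliasing rule? Put simply, you can't access an object of one type through an lvalue of another, unrelated type. A char * may point at the bytes of any object, but an int * may not be used to read memory that was created as a char array. The immediate cause of the SIGBUS is most likely alignment: if the buffer returned by new char[250] starts on an aligned address, then offset 250 - sizeof(int) = 246 is not a multiple of 4, and SPARC traps on a misaligned 4-byte load.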

    Oracle's Solaris Studio compilers actually provide a command-line argument that will let you get away with that on SPARC hardware - -xmemalign=1i (see https://docs.oracle.com/cd/E19205-01/819-5265/bjavc/index.html). Although to be fair to GCC, without that option, the forcing you do in your code will still SIGBUS under the Studio compiler.

    Or, as you've already noted, you can use memcpy() to copy bytes around no matter what they are - as long as you know the source object is safe to copy into the target object - yes, there are cases when that's not true.
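A minimal sketch of the memcpy() approach, wrapped in a small helper. The name `load_int` and its bounds check are illustrative assumptions, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: read an int stored at an arbitrary byte offset.
// memcpy copies byte by byte as far as the language is concerned, so the
// source address does not need to be aligned for int.
inline int load_int(const unsigned char* buf, std::size_t size, std::size_t offset) {
    // Guard against reading past the end of the buffer.
    assert(offset <= size && size - offset >= sizeof(int));
    int value;
    std::memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

In the question's code this would replace the cast with something like `int original_var = load_int(mem_u_const, 250, 250 - sizeof(int));`. Compilers typically turn a fixed-size memcpy like this into a plain load where the hardware allows it, so the cost is usually small.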