
How does copying a const char* into a char array work?


I have some code with a buffer, and I am trying to copy a constant string into it like this:

#include <stdio.h>

typedef struct _BIGWORD {
    unsigned char Byte[53]; //size of constant string
} BIGWORD;

int main() {
    char cBuffer[64] = { 0 };

    BIGWORD *bwBufferCast = (BIGWORD *)cBuffer;
    
    *bwBufferCast = *(BIGWORD *){ "I am trying to copy this whole text inside a buffer!" };

    printf("text: %s\n", cBuffer);

    return 0;
}

This works fine, but I don't know how. Does it copy the constant string byte by byte, does it copy it all at once, or is it just a coincidence that the text ends up there, so that it wouldn't work in real scenarios?

I know I can create a char array like: char cBuffer[64] = "this is an array!". I want to put the constant string inside a buffer on demand.

I tried to add some other constant string to the code to check if it would affect the result but it didn't.


Solution

  • There seems to be a lot of confusion in comments regarding what is "undefined behavior" and what isn't, so let me go through this line by line:

    • BIGWORD* bwBufferCast = (BIGWORD*)cBuffer;
      This is valid C since all manner of wild and crazy pointer casts are allowed in C (C17 6.3.2.3 §7).

      However, this pointer conversion may in some cases invoke undefined behavior, because the character array you start from may not be suitably aligned and the struct type might have an alignment requirement. If that happens, some CPUs might trap on the pointer assignment itself. On other CPUs you might get problems later, when you dereference the pointer.

    • (BIGWORD*) { "I am trying to copy this whole text inside a buffer!" };
      This is invalid C and will not compile cleanly. You create a compound literal consisting of a pointer. Initialization of that pointer (a so-called scalar) is done "as if by assignment" (C17 6.7.9 §11). Specifically, you are trying to initialize a BIGWORD* with a char*.

      We may then check the rules of valid assignment (C17 6.5.16.1) and find that for an implicit pointer conversion to be allowed during assignment, the pointers need to be compatible. They are not (C17 6.2.7), so this is a constraint violation and the compiler must issue a diagnostic.

      A program with constraint violations that still results in a binary executable is non-conforming C, and the standard makes no guarantees about its behavior.

    • *(BIGWORD*) is fishy. Again there is the previously mentioned potential alignment issue: if the string literal were misaligned for the struct, that would be undefined behavior.

      It is also fishy because you make an lvalue access of an object using a type different from its declared effective type. Normally we would call this a "strict aliasing violation" (see "What is the strict aliasing rule?"). TL;DR: a violation of the type system rules, resulting in undefined behavior, which could lead to incorrect code generation by the compiler.

      But as it happens, this very line is not a strict aliasing violation because (by luck?) you managed to fulfill one of the exceptions to that rule (C17 6.5 §7). We have a struct, an aggregate type, and there's an exception: "an aggregate or union type that includes one of the aforementioned types among its members". Here "aforementioned types" includes "a type that is the signed or unsigned type corresponding to the effective type of the object". A struct with an unsigned char[] member includes the unsigned type corresponding to the effective type char[] of the object.

    • *bwBufferCast has the same issues as above: potential misalignment, and fishy but not a strict aliasing violation.
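
As a rough illustration of the alignment concern raised for the pointer casts above, the sketch below checks whether an address is suitably aligned for the struct before any cast is attempted. The helper name aligned_for_bigword is hypothetical, and the check assumes C11 (_Alignof) plus an implementation that provides uintptr_t and maps pointers to integers in the obvious flat way. The struct is declared without the _BIGWORD tag, since identifiers that start with an underscore followed by an uppercase letter are reserved.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    unsigned char Byte[53];
} BIGWORD;

/* Hypothetical helper: returns true if p is suitably aligned for a BIGWORD.
   Assumption: uintptr_t exists and the converted integer reflects the
   address, which holds on mainstream platforms but is implementation-defined. */
static bool aligned_for_bigword(const void *p)
{
    return (uintptr_t)p % _Alignof(BIGWORD) == 0;
}
```

On many mainstream ABIs a struct that holds only an unsigned char array has an alignment of 1, in which case this check always passes, but the standard does not promise that. Declaring the buffer as _Alignas(BIGWORD) char cBuffer[64] would remove the doubt.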


    Conclusion:

    The program does not work fine because it does not compile. With a conforming compiler (gcc 13.3 -std=c17 -pedantic-errors) I get the expected diagnostic message:

    error: initialization of 'BIGWORD *' {aka 'struct _BIGWORD *'} from incompatible pointer type 'char *' [-Wincompatible-pointer-types]

    If you ran the executable anyway and it didn't trip over misalignment, then you might still get the expected results, either by (bad) luck or by a guarantee made through a compiler extension.

    Also please note that "I only got a warning" still means that the program could be invalid C, without guaranteed behavior. See "What must a C compiler do when it finds an error?"
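
For comparison, here is a minimal sketch of one way to get the intended effect without the constraint violation or the pointer casts. It is not the only option (strcpy or snprintf with a size check would also do). The helper name copy_text and its return convention are assumptions made for illustration. The source is a BIGWORD object rather than a pointer, so the string literal initializes the array member directly, and memcpy copies plain bytes, so neither alignment nor aliasing comes into play.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned char Byte[53]; /* 52 characters plus the terminating null */
} BIGWORD;

/* Hypothetical helper: copies the text into dst, which holds dst_size bytes.
   Returns 0 on success, -1 if dst is too small. */
static int copy_text(char *dst, size_t dst_size)
{
    /* A struct object, not a pointer: the string literal initializes the
       array member, so no pointer conversion is involved. */
    const BIGWORD src = { "I am trying to copy this whole text inside a buffer!" };

    if (dst_size < sizeof src.Byte) {
        return -1;
    }

    /* memcpy operates on bytes, so dst needs no particular alignment. */
    memcpy(dst, src.Byte, sizeof src.Byte);
    return 0;
}
```

Called from main as copy_text(cBuffer, sizeof cBuffer), this leaves a null-terminated string in cBuffer that can be printed with printf("text: %s\n", cBuffer). Whether the compiler turns the memcpy into a byte loop or a few wide moves is an implementation detail; the observable result is the same.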