Tags: c, generics, gcc, vector, memcpy

gcc optimising away compound statement


I am having an issue implementing a push_back operation for a generic resizable vector in C. For genericity I need to take a void pointer as the argument, but in practice I want to pass values directly.

When I compile the below code with gcc -o t test.c -std=c99, it prints 10 as I expect. When I add -O1 (or higher) to the compile options, the program prints 0.

I think the problem is in the smemcpy code, as when I replace it with memcpy I no longer have this problem.

Simplified code:

#include <stdio.h>
#include <stdlib.h>

#define get_pointer(value) ({ __typeof__(value) tmp = value; &tmp; })

// copy from src to dst byte by byte
void* smemcpy(void* dst, void const * src, size_t len) {
    char * pdst = (char *) dst;
    char const * psrc = (char const *) src;

    while (len--) {
        *pdst++ = *psrc++;
    }

    return (dst);
}


int main() {
    void* container = malloc(sizeof(int));

    // copy a 10 into the container via a temporary pointer
    smemcpy(container, get_pointer(10), sizeof(int));

    printf("%d\n", ((int*)container)[0]);

    return 0;
}

Thanks in advance for any help,

B


Solution

  • The definition of get_pointer uses a statement expression (a compound statement enclosed in parentheses), which is a GCC extension. tmp is declared inside the block of that statement expression, so its storage duration is tied to that block, and there is no reason to believe the object persists beyond the evaluation of the statement expression.

    Thus, in preparing the call to smemcpy, the compiler may evaluate get_pointer by creating the object tmp, producing its address as the value of the statement expression, and destroying the object tmp. The now-invalid address of the no-longer-existing object is then passed to smemcpy, which may copy garbage because the space used for tmp may already have been reused for another purpose. This explains why the result changes with the optimisation level.

    The code may appear to work when memcpy is used because memcpy is a built-in function known to GCC, and GCC optimizes it in various special ways. That does not make the code valid: the pointer passed to it is still dangling.

    A compound literal should work; the C standard specifies that a compound literal within the body of a function has automatic storage duration associated with the enclosing block. If we define get_pointer as follows, the enclosing block includes the entire smemcpy call:

    #define get_pointer(value) (& (__typeof__(value)) { value })
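
To make the lifetime argument concrete, here is a minimal sketch of the compound-literal macro in use, assuming GCC or Clang (for __typeof__) and C99. The helpers roundtrip_literal and roundtrip_block and the macro copy_value are hypothetical names introduced only for this illustration. copy_value is an alternative that is not part of the answer above: it keeps the temporary and the call inside the same block, so no pointer ever outlives the object it points to.

```c
#include <stddef.h>
#include <stdlib.h>

/* Compound literal: the unnamed object lives until the end of the
   enclosing block, so its address is still valid during the call. */
#define get_pointer(value) (&(__typeof__(value)){ (value) })

/* Hypothetical alternative (an assumption, not from the answer): declare
   the temporary and make the call in one block, so the pointer never
   escapes the lifetime of tmp_. */
#define copy_value(dst, value)                  \
    do {                                        \
        __typeof__(value) tmp_ = (value);       \
        smemcpy((dst), &tmp_, sizeof tmp_);     \
    } while (0)

/* Byte-by-byte copy, as in the question. */
void *smemcpy(void *dst, const void *src, size_t len) {
    char *pdst = dst;
    const char *psrc = src;

    while (len--) {
        *pdst++ = *psrc++;
    }
    return dst;
}

/* Hypothetical helper: store v via get_pointer, then read it back. */
int roundtrip_literal(int v) {
    int *container = malloc(sizeof *container);
    if (container == NULL) {
        abort();
    }
    /* The compound literal is alive for the whole function body. */
    smemcpy(container, get_pointer(v), sizeof(int));
    int result = *container;
    free(container);
    return result;
}

/* Hypothetical helper: the same round trip using copy_value. */
int roundtrip_block(int v) {
    int *container = malloc(sizeof *container);
    if (container == NULL) {
        abort();
    }
    copy_value(container, v);
    int result = *container;
    free(container);
    return result;
}
```

Both helpers should return the value they were given regardless of optimisation level, since in each case the temporary is still alive while smemcpy reads from it. The copy_value form cannot be used as an expression, which may or may not matter for a push_back interface.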