Tags: c, flexible-array-member, storage-duration

How can I initialize a flexible array in rodata and create a pointer to it?


In C, the code

char *c = "Hello world!";

stores Hello world!\0 in rodata and initializes c with a pointer to it. How can I do this with something other than a string?

Specifically, I am trying to define my own string type

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

I then want a macro that lets me write

const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

And have it behave the same way: the length followed by the characters (\x0c\0\0\0Hello world! if the length field is 4 bytes, little-endian) is stored in rodata and c2 is initialized with a pointer to it.

I tried using

#define PASCAL_STRING_CONSTANT(c_string_constant) \
    &((const PascalString) { \
        .Length=sizeof(c_string_constant)-1, \
        .Data=(c_string_constant), \
    })

as suggested in these questions, but it doesn't work because Data is a flexible array member. GCC reports error: non-static initialization of a flexible array member, and Clang gives a similar error.

Is this possible in C? And if so, what would the PASCAL_STRING_CONSTANT macro look like?

To clarify

With a C string, the following code-block never stores the string on the stack:

#include <inttypes.h>
#include <stdio.h>

int main(void) {
    const char *c = "Hello world!";

    printf("test %s", c);

    return 0;
}

As we can see by looking at the assembly, line 5 compiles to just loading a pointer into a register.

I want the same behavior with Pascal strings, and with GNU extensions that is possible. The following code also never stores the Pascal string on the stack:

#include <inttypes.h>
#include <stdio.h>

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

#define PASCAL_STRING_CONSTANT(c_string_constant) ({\
        static const PascalString _tmpstr = { \
            .Length=sizeof(c_string_constant)-1, \
            .Data=c_string_constant, \
        }; \
        &_tmpstr; \
    })

int main(void) {
    const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

    printf("test %.*s", (int)c2->Length, c2->Data); /* %.*s expects an int precision */

    return 0;
}

Looking at its generated assembly, line 18 is also just loading a pointer.

However, the best code I've found that does this in standard C (C99 compound literals, no GNU extensions) copies the entire string onto the stack:

#include <inttypes.h>
#include <stdio.h>

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

#define PASCAL_STRING_CONSTANT(initial_value) \
    (const PascalString *)&(const struct { \
        size_t Length; /* must match the type of PascalString.Length */ \
        char Data[sizeof(initial_value)]; \
    }){ \
        .Length = sizeof(initial_value)-1, \
        .Data = initial_value, \
    }

int main(void) {
    const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

    printf("test %.*s", (int)c2->Length, c2->Data); /* %.*s expects an int precision */

    return 0;
}

In the generated assembly for this code, line 19 copies the entire struct onto the stack then produces a pointer to it.

I'm looking for either standard C code (without GNU extensions) that produces the same assembly as my second example, or an explanation of why that's not possible in standard C.


Solution

  • This can be done with the GNU statement-expression extension, although it is nonstandard.

    #define PASCAL_STRING_CONSTANT(c_string_constant) ({\
            static const PascalString _tmpstr = { \
                .Length=sizeof(c_string_constant)-1, \
                .Data=c_string_constant, \
            }; \
            &_tmpstr; \
        })
    

    The extension lets a brace-enclosed block, wrapped in ({ ... }), be used as an expression whose value is that of the last expression statement in the block. Thus, we can declare our PascalString as a static const object and yield a pointer to it. Note that initializing the flexible array member of a static object is itself a GNU extension.

    For completeness, we can also make a stack buffer if we want to modify it:

    #define PASCAL_STRING_STACKBUF(initial_value, capacity) \
        (PascalString *)&(struct { \
            size_t Length; /* must match the type of PascalString.Length */ \
            char Data[capacity]; \
        }){ \
            .Length = sizeof(initial_value)-1, \
            .Data = initial_value, \
        }
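
If a declaration is acceptable in place of an expression, something close to the same result seems reachable without GNU extensions. The following is only a minimal sketch under stated assumptions: PASCAL_STRING_DEFINE and the name##_storage object are hypothetical names that do not appear in the question, and the cast relies on both struct types placing Data at the same offset, which holds on common ABIs but is not something the standard promises for unrelated struct types.

```c
#include <stddef.h>
#include <stdio.h>

typedef struct {
    size_t Length;
    char Data[];
} PascalString;

/* Hypothetical helper, not from the original answer: defines a static,
 * fixed-size object holding the text, plus a pointer `name` that views it
 * as a PascalString. Assumes both struct types place Data at the same
 * offset (true on common ABIs, not guaranteed by the standard). */
#define PASCAL_STRING_DEFINE(name, c_string_constant) \
    static const struct { \
        size_t Length; \
        char Data[sizeof(c_string_constant)]; \
    } name##_storage = { \
        .Length = sizeof(c_string_constant) - 1, \
        .Data = c_string_constant, \
    }; \
    static const PascalString *const name = \
        (const PascalString *)&name##_storage

/* The macro expands to two static declarations, so it can be used at
 * file scope or inside a function body. */
PASCAL_STRING_DEFINE(greeting, "Hello world!");

/* Prints the text using the stored length rather than the trailing NUL. */
void print_greeting(void) {
    printf("test %.*s\n", (int)greeting->Length, greeting->Data);
}
```

Inside a function, `PASCAL_STRING_DEFINE(c2, "Hello world!");` would declare c2 the same way. Because both the storage object and the pointer have static storage duration and constant initializers, a compiler would typically place the data in a read-only section rather than copy it onto the stack, though that placement is an implementation detail.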