c language-lawyer flexible-array-member c17

Is this C program with two struct definitions, involving a flexible array member, defined?


A struct with a flexible array member is a type with which a variable can be declared and to which sizeof can be applied. This leads to unusual behavior in the following program.

file fam1.c:

#include <stdio.h>
#include <stddef.h>

struct s {
  char c;
  char t[]; };

extern struct s x;

size_t s_of_x(void);

int main(void) {
  printf("size of x: %zu\n", sizeof x);
  printf("size of x: %zu\n", s_of_x());
}

file fam2.c:

#include <stddef.h>

struct s {
  char c;
  char t[2]; };

struct s x;

size_t s_of_x(void) {
  return sizeof x;
}

This program, when compiled and run, emits a somewhat surprising output:

$ clang -std=c17 -pedantic -Wall fam1.c fam2.c
$ ./a.out 
size of x: 1
size of x: 3

Note that you can instead put the extern declaration in fam2.c, which makes the program behave even more unexpectedly if x.t is accessed. To be clear, I don't know whether such a variant would be less defined according to the C17 standard, but I am pretty sure that most compilers would generate object files that, when linked together, produce a dysfunctional binary.

I am unsure whether the intent of the C17 standard is to make the program made of fam1.c and fam2.c undefined, but I do not see which clauses in it make it so. One might think of C17's clauses 6.2.7:1 and 6.2.7:2, but if you read them carefully, they appear to allow exactly what fam1.c and fam2.c are doing:

6.2.7 Compatible type and composite type

6.2.7:1 Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers, and in 6.7.6 for declarators. Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types; if one member of the pair is declared with an alignment specifier, the other is declared with an equivalent alignment specifier; and if one member of the pair is declared with a name, the other is declared with the same name. For two structures, corresponding members shall be declared in the same order. For two structures or unions, corresponding bit-fields shall have the same widths. For two enumerations, corresponding members shall have the same values.

6.2.7:2 All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.

For reference, flexible array members are described in 6.7.2.1:18:

6.7.2.1:18 As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. In most situations, the flexible array member is ignored. In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply. However, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.
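For readers less familiar with flexible array members, the paragraph above can be illustrated with a minimal sketch of their usual, uncontroversial use. The type struct msg and the function msg_new are hypothetical names chosen for this illustration and do not appear in the question; the sketch assumes a heap allocation sized as sizeof of the struct plus the bytes wanted for the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for illustration; not the struct s from the question. */
struct msg {
  size_t len;
  char data[];  /* flexible array member: contributes no elements to sizeof */
};

/* Allocate a struct msg with room for n bytes in data and copy src into it.
   Returns NULL if the allocation fails. The caller frees the result. */
struct msg *msg_new(const char *src, size_t n) {
  assert(src != NULL);
  /* sizeof *m covers len plus any padding, but none of data's elements. */
  struct msg *m = malloc(sizeof *m + n);
  if (m == NULL) {
    return NULL;
  }
  m->len = n;
  /* Through ->, data behaves like the longest array that fits the object. */
  memcpy(m->data, src, n);
  return m;
}
```

Here the object being accessed is the malloc'ed block, so m->data acts as an array of n chars. In the question's fam1.c, by contrast, the object is whatever x designates, which is what makes the two-file program interesting.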

Am I missing something, in 6.2.7 or elsewhere in C17, that makes fam1.c+fam2.c undefined? Or is it a defined C program according to the C17 standard, and in that case, is the variant where extern is on the non-FAM version of the struct and x.t is accessed in the same compilation unit defined for the same reason?

(This is a digression, but I think I can explain why 6.2.7:1 is written the way it is. The intent is likely to allow, say, struct s { int (*m)[]; } x; in one compilation unit and struct s { int (*m)[2]; } x; in another.)
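The compatibility that the digression relies on can be seen inside a single translation unit. The sketch below is only an illustration under that assumption: the names pair and read_second are made up here, and the code shows that int[] and int[2] are compatible array types, so pointers to them can be assigned to each other without a cast.

```c
#include <assert.h>

static int pair[2] = {10, 20};

/* int[] (unknown size) and int[2] are compatible array types, so a pointer
   to one may be assigned to a pointer to the other without a cast. */
int read_second(void) {
  int (*p)[] = &pair;  /* pointer to array of unknown size */
  int (*q)[2] = p;     /* pointer to array of 2 int */
  /* sizeof *q is known; sizeof *p would not compile (incomplete type). */
  assert(sizeof *q == 2 * sizeof(int));
  return (*p)[1] + (*q)[1];  /* both read pair[1] */
}
```

Calling read_second() reads pair[1] through both pointers and returns their sum.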


Solution

  • As noted in the question, C 2018 6.7.2.1 18 says:

    … In most situations, the flexible array member is ignored…

    We may regard this as ignoring the flexible array member except where otherwise stated or where necessity dictates it. (For the latter, I am considering alignment requirement. The standard explicitly says a structure with a flexible array member may have more trailing padding than it would without the flexible array member but omits mention of the fact that it may have a greater alignment requirement. But clearly a flexible array member may impose a greater alignment requirement, if its elements have a greater requirement than other members of the structure.)

    Since, for purposes of determining compatibility, no exception to ignoring the flexible array member is stated, we should ignore the flexible array member for purposes of determining compatibility (but not ignore its extra padding and alignment requirement). Then, applying the rule in 6.2.7 1, we see that the two struct s declarations in the question do not have a “one-to-one correspondence between their members,” since one has an array member at the end and the other, when we ignore the flexible array member, does not.

    Further, I would regard the lack of mention of the potential additional padding in 6.2.7 1 (and the lack of mention of additional alignment requirement) as evidence the committee failed to fully consider the effects of flexible array members on the statements of compatibility in 6.2.7 1, and 6.2.7 1 is therefore incomplete.

    The above attempt to wrest meaning out of words written imperfectly by humans leaves open the possibility that a structure type with a flexible array member would be deemed compatible with a structure type without the flexible array member with the same alignment requirement (and hence the same trailing padding). That may be an unintentional consequence, but may not cause any problems—the two types are deemed compatible only when declared in separate translation units and will behave the same for assignment and other actions except that the flexible array member will be accessible in one translation unit and not in the other.
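The point about padding and alignment can be checked on a given implementation with a small sketch. The struct names plain and with_fam and the two helper functions are hypothetical, introduced only for this illustration; the actual numbers are implementation-specific, so none are claimed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
struct plain    { char c; };
struct with_fam { char c; double t[]; };  /* same members plus a FAM */

/* Extra trailing padding attributable to the flexible array member. */
size_t fam_extra_padding(void) {
  /* 6.7.2.1 18 permits more trailing padding, never less. */
  assert(sizeof(struct with_fam) >= sizeof(struct plain));
  return sizeof(struct with_fam) - sizeof(struct plain);
}

/* Alignment requirement of the struct that has the FAM. */
size_t fam_alignment(void) {
  return _Alignof(struct with_fam);
}
```

On an implementation where double is more strictly aligned than char, one would expect fam_alignment() to exceed _Alignof(struct plain) and fam_extra_padding() to be nonzero. In that case the two types differ in size and alignment even though the flexible array member is otherwise "ignored".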