c macros variadic-macros c23

Can __VA_OPT__(,) detect a trailing comma with nothing after it?


While playing with __VA_OPT__(,) I noticed the following behavior.

Background

First, note that C is fine with trailing commas in an initializer. These are equivalent:

int foo[] = { 10, 20, 30 };
int baz[] = { 10, 20, 30, };

So let's contrive an initializer macro, which also works in both cases:

#define bar(...) { __VA_ARGS__ }
int foo[] = bar(10, 20, 30);
int baz[] = bar(10, 20, 30,);

Now I want to append 40 to the list, even if the list is empty. Both of these work: foo gets 10, 20, 30, 40 and baz gets 40:

#define bar(...) { __VA_ARGS__ __VA_OPT__(,) 40 }
int foo[] = bar(10, 20, 30);
int baz[] = bar();

The problem

But what if we accidentally leave a trailing comma at the end?

#define bar(...) { __VA_ARGS__ __VA_OPT__(,) 40 }
int foo[] = bar(10, 20, 30, );

Now we get this error, which isn't very intuitive. It never points directly at the "problem comma" after the 30:

r.c: In function ‘main’:
r.c:160:43: error: expected expression before ‘,’ token
  160 | #define bar(...) { __VA_ARGS__ __VA_OPT__(,) 40 }
      |                                           ^
r.c:164:21: note: in expansion of macro ‘bar’
  164 |         int foo[] = bar(10, 20, 30,);
      |                     ^~~

where the preprocessor doubles up the comma:

# cpp r.c|grep -v ^#|indent
...
int foo[] = { 10, 20, 30,, 40 };

Question

Is it possible to contrive a macro that prevents doubling the comma, or adjust the macro to produce a more sane error from the compiler?

More complicated example

The example above is pretty simple, but when things get complicated, and you miss a comma, then the error is horrible. Take this example from here.

(Please note: This is not an XY problem question because I really do want to know if C macros can suppress double-comma situations. Please do not suggest "fixes" to this more complex, yet still contrived, second example.)

Suppose we are statically initializing recursive tree-like structs:

struct menu
{
    char *name;
    struct menu **submenu;
};

#define MENU_LIST(...) (struct menu*[]){ __VA_ARGS__ __VA_OPT__(,) NULL }
#define MENU_ITEM(...) &(struct menu){ __VA_ARGS__ }

which provides very nice functional-looking syntax like below...but can you spot the "extra" comma in the code below, which produces this error? You might not see it if you are used to trailing commas being valid:

r.c:11:65: error: expected expression before ‘,’ token
   11 | #define MENU_LIST(...) (struct menu*[]){ __VA_ARGS__ __VA_OPT__(,) NULL }
      |                                                                 ^
r.c:113:25: note: in expansion of macro ‘MENU_LIST’
  113 | struct menu *mymenu[] = MENU_LIST(
      |                         ^~~~~~~~~
r.c:122:9: note: in expansion of macro ‘MENU_ITEM’
  122 |         MENU_ITEM(
      |         ^~~~~~~~~
r.c:124:36: note: in expansion of macro ‘MENU_LIST’
  124 |                         .submenu = MENU_LIST(
      |                                    ^~~~~~

struct menu *mymenu[] = MENU_LIST(   // line 113
    MENU_ITEM(
            .name = "planets",
            .submenu = MENU_LIST(
                MENU_ITEM( .name = "Earth" ),
                MENU_ITEM( .name = "Mars" ),
                MENU_ITEM( .name = "Jupiter" )
            )
    ),
    MENU_ITEM(                       // line 122
            .name = "stars",
            .submenu = MENU_LIST(    // line 124
                MENU_ITEM( .name = "Sun" ),
                MENU_ITEM( .name = "Vega" ),
                MENU_ITEM( .name = "Proxima Centauri" ),
            )
    ),
    MENU_ITEM(
            .name = "satellites",
            .submenu = MENU_LIST(
                MENU_ITEM( .name = "ISS" ),
                MENU_ITEM( .name = "OreSat0" )
            )
    )
);

Solution

  • @John Bollinger's answer is probably the correct one: don't do this. You'll end up with an unholy mess of macro goo which is ten times harder to maintain than just typing out all the initializers by hand and maintaining that list.

    That being said, after some headaches, gcc bug encounters and truly evil macro attempts that the world will never see, I managed to boil it down to a reasonable work-around that's almost readable:

    • The key is to use designated initializers. They come with a handy trick: whenever one is used, it changes the "current object" of the initializer list. So we can initialize the array by first setting the new last item to our "sentinel value" (40, NULL or whatever) and then initializing the remaining items while starting over at index zero. We then don't have to care about the trailing comma, since it ends up last in the list.

    • How do we know the index of the last item in an initializer list of unknown size? If we know the item type then this macro counts the number of items passed:

      #define COUNT_ARGS(...) ( sizeof((int[]){__VA_ARGS__}) / sizeof(int) )
      

      That macro builds a compound literal array with one element per item passed, then divides the size of that array by the size of the expected element type. (If you want to make it type safe, you can sneak in a _Generic somewhere.)

    • Please note that an empty array initializer list {} is allowed as per C23. However, if we end up with sizeof((int[]){}) because __VA_ARGS__ is empty, then that's a zero-size array, which is invalid C and an icky GNU extension. This GNU crap tripped me up because buggy gcc -std=c23 -pedantic-errors lets it slip through silently although it is non-conforming C. The daily gcc bug.

      Anyways, to stick to standard C, we have to conjure a work-around:

       #define COUNT_ARGS(...) ( 0 __VA_OPT__(+ sizeof((int[]){__VA_ARGS__}) / sizeof(int)) )
      

      That is: in case there is a variable argument list, do 0 + the sizeof trick. Otherwise, if there are no variable arguments, __VA_OPT__ discards everything but the zero. We'll end up with a designated initializer [0] where we assign the sentinel, which is fine since that creates an array of one single object.

    • So the designated initializer we want to use to insert our sentinel value will look like this:

      [COUNT_ARGS(__VA_ARGS__)] = SENTINEL
      

      This gives an index equal to the number of initializers passed, which is one past the index of the last of them, thereby expanding the array size by 1. (Assuming the caller did write int arr[] = ... of course.) And in case of an empty argument list, the 0 + trick above means we assign item index zero, the first one.

    • We put that designated initializer first. Then the rest of the array initializer list becomes [0] = __VA_ARGS__. If we pass 1,2,3 then this expands to [0]=1,2,3. The handy thing is that [0] sets the "current object" of the initializer list to index zero, initializing that item to 1 in this example. The C language then states that the current object gets incremented from there on for the remaining items in the list, so 2 goes to the second item at index [1] and so on. And if there's a trailing comma at the end, nobody cares.

    • To handle the situation with an empty initializer list, we can simply put the whole [0] = __VA_ARGS__ trick inside the __VA_OPT__. That way, in case the argument list is empty, only the sentinel gets added.
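
    To see the mechanism without any macros in the way, here is a small hand-expanded sketch of the initializers that the steps above build up. The array names with_args and no_args are made up for this illustration; the first one corresponds to passing 10, 20, 30, (with the trailing comma), the second to passing nothing at all.

    ```c
    #include <stdio.h>

    /* Hand-written initializer for the arguments "10, 20, 30,":
       the sentinel is placed at index 3 first, then [0] resets the
       current object so 10, 20, 30 fill indices 0..2. The trailing
       comma is just an ordinary trailing comma of the list. */
    static const int with_args[] = { [3] = 40, [0] = 10, 20, 30, };

    /* Empty argument list: only the sentinel at index 0 remains. */
    static const int no_args[] = { [0] = 40, };

    int main(void)
    {
        for (size_t i = 0; i < sizeof with_args / sizeof *with_args; i++)
            printf("%d ", with_args[i]);
        puts("");

        for (size_t i = 0; i < sizeof no_args / sizeof *no_args; i++)
            printf("%d ", no_args[i]);
        puts("");
    }
    ```

    The array sizes come out as 4 and 1 because the largest designated index decides the size of an array declared with [].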

    Complete macro:

    #define SENTINEL 40
    #define COUNT_ARGS(...) ( 0 __VA_OPT__(+ sizeof((int[]){__VA_ARGS__}) / sizeof(int)) )
    #define bar(...) { [COUNT_ARGS(__VA_ARGS__)] = SENTINEL, __VA_OPT__([0] = __VA_ARGS__) }
    

    Self-contained example with tests:

    #include <stdio.h>
    
    #define SENTINEL 40
    #define COUNT_ARGS(...) ( 0 __VA_OPT__(+ sizeof((int[]){__VA_ARGS__}) / sizeof(int)) )
    #define bar(...) { [COUNT_ARGS(__VA_ARGS__)] = SENTINEL, __VA_OPT__([0] = __VA_ARGS__) }
    
    #define TEST(arr) for(size_t i=0; i<sizeof(arr)/sizeof*arr; i++) printf("%d ", arr[i]); puts("");
    int main (void)
    {
      int foo1[] = bar(10, 20, 30);
      int foo2[] = bar(10, 20, 30,);
      int foo3[] = bar();
      int foo4[] = bar(10, 20, 30, 50, 60);
    
      TEST(foo1);
      TEST(foo2);
      TEST(foo3);
      TEST(foo4);
    }
    

    Output:

    10 20 30 40 
    10 20 30 40 
    40 
    10 20 30 50 60 40 
    

    And to answer the question "can __VA_OPT__(,) detect a trailing comma with nothing after it?": no, it can't, but it's a handy trick to use when allowing an empty initializer list. The actual fix for the trailing comma here was the designated initializers.
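
    The same idea could plausibly be carried over to the MENU_LIST macro from the question. What follows is only a sketch under assumptions: MENU_COUNT and menu_length are hypothetical helper names that appear in neither the question nor the answer above, and the list is kept flat to stay short. Note that __VA_ARGS__ is expanded twice per MENU_LIST (once for counting, once for initializing), so deeply nested menus make the preprocessed output grow quickly.

    ```c
    #include <stdio.h>

    struct menu
    {
        char *name;
        struct menu **submenu;
    };

    /* Hypothetical helper: COUNT_ARGS with int swapped for struct menu *. */
    #define MENU_COUNT(...) ( 0 __VA_OPT__(+ sizeof((struct menu*[]){__VA_ARGS__}) / sizeof(struct menu*)) )

    /* Sentinel first, at index == number of items; then restart at [0]. */
    #define MENU_LIST(...) (struct menu*[]){ [MENU_COUNT(__VA_ARGS__)] = NULL, __VA_OPT__([0] = __VA_ARGS__) }
    #define MENU_ITEM(...) &(struct menu){ __VA_ARGS__ }

    /* Hypothetical helper: counts entries up to, not including, the NULL sentinel. */
    size_t menu_length(struct menu **list)
    {
        size_t n = 0;
        while (list[n] != NULL)
            n++;
        return n;
    }

    int main(void)
    {
        struct menu **stars = MENU_LIST(
            MENU_ITEM( .name = "Sun" ),
            MENU_ITEM( .name = "Vega" ),
            MENU_ITEM( .name = "Proxima Centauri" ),   /* trailing comma on purpose */
        );
        struct menu **empty = MENU_LIST();

        printf("%zu %zu\n", menu_length(stars), menu_length(empty));
        for (size_t i = 0; stars[i] != NULL; i++)
            puts(stars[i]->name);
    }
    ```

    With the trailing comma left in place, the list still holds three items plus the NULL sentinel, and the empty list holds the sentinel alone.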