
Why doesn't the compiler report a missing semicolon?


I have this simple program:

#include <stdio.h>

struct S
{
    int i;
};

void swap(struct S *a, struct S *b)
{
    struct S temp;
    temp = *a    /* Oops, missing a semicolon here... */
    *a = *b;
    *b = temp;
}

int main(void)
{
    struct S a = { 1 };
    struct S b = { 2 };

    swap(&a, &b);
}

As seen on e.g. ideone.com this gives an error:

prog.c: In function 'swap':
prog.c:12:5: error: invalid operands to binary * (have 'struct S' and 'struct S *')
     *a = *b;
     ^

Why doesn't the compiler detect the missing semicolon?


Note: This question and its answer are motivated by another question. While there are other similar questions, I didn't find any that mention the free-form nature of the C language, which is what causes this and related errors.


Solution

  • C is a free-form language. That means the compiler largely ignores whitespace and line breaks between tokens, so you can lay out the same code in many ways and it will still be a legal program.

    For example a statement like

    a = b * c;
    

    could be written like

    a=b*c;
    

    or like

    a
    =
    b
    *
    c
    ;
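
To make the point concrete, here is a minimal sketch that puts the three layouts above into three functions. The function names and the `check_layouts` helper are made up for illustration and are not part of the original answer.

```c
#include <assert.h>

/* Conventional layout. */
int product_spaced(int b, int c)
{
    int a;
    a = b * c;
    return a;
}

/* No optional whitespace at all. */
int product_dense(int b, int c)
{
    int a;
    a=b*c;
    return a;
}

/* One token per line; the statement ends only at the semicolon. */
int product_split(int b, int c)
{
    int a;
    a
    =
    b
    *
    c
    ;
    return a;
}

/* Hypothetical helper: all three layouts compute the same value. */
void check_layouts(void)
{
    assert(product_spaced(6, 7) == product_dense(6, 7));
    assert(product_dense(6, 7) == product_split(6, 7));
}
```

Calling `check_layouts()` from `main` should pass silently, since the compiler sees the same token sequence in all three functions.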
    

    So when the compiler sees the lines

    temp = *a
    *a = *b;
    

    it thinks it means

    temp = *a * a = *b;
    

    That is of course not a valid expression, so the compiler complains about that instead of the missing semicolon. It's not valid because a is a pointer to a structure, so *a * a tries to multiply a structure instance (*a) by a pointer to a structure (a).
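
For comparison, here is a sketch of the function from the question with the semicolon restored. Once the first statement is terminated, the `*` at the start of the next line can only be a dereference. The `check_swap` helper is an addition for illustration and is not part of the question.

```c
#include <assert.h>

struct S
{
    int i;
};

/* Same function as in the question, with the semicolon restored. */
void swap(struct S *a, struct S *b)
{
    struct S temp;
    temp = *a;   /* the semicolon ends the statement here */
    *a = *b;     /* so this leading '*' is a dereference, not a multiplication */
    *b = temp;
}

/* Hypothetical helper: verifies that the two values trade places. */
void check_swap(void)
{
    struct S a = { 1 };
    struct S b = { 2 };

    swap(&a, &b);
    assert(a.i == 2);
    assert(b.i == 1);
}
```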

    Not only does the compiler fail to detect the missing semicolon, it also reports a seemingly unrelated error on the wrong line. This is important to notice, because no matter how long you stare at the line where the error is reported, there is nothing wrong there. With problems like this you sometimes need to check the preceding lines to see whether they are free of errors.

    Sometimes you even have to look in another file to find the error. For example, if the last thing a header file does is define a structure, and the semicolon terminating that definition is missing, then the error will not be reported in the header file but in the file that includes it.

    And sometimes it gets even worse: if you include two (or more) header files, and the first one ends with an incomplete declaration, the syntax error will most probably be reported in the second header file.
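
The header scenario can be sketched in a single file by placing the contents of a header and of the file that includes it one after the other, which is roughly what the compiler sees after preprocessing. The names `point.h`, `struct point`, `sum` and `check_sum` are hypothetical. The code below is the correct version; the comments mark where the diagnostic would typically move if the semicolon were dropped.

```c
#include <assert.h>

/* --- contents of a hypothetical header, point.h --- */
struct point
{
    int x;
    int y;
};  /* <- the semicolon that is easy to forget at the end of a header */

/* --- contents of a hypothetical source file that includes point.h --- */

/* Without the semicolon above, the parser would read
   "struct point { ... } int sum(...)" as one declaration, so the
   error would typically be reported at this line, not in point.h. */
int sum(struct point p)
{
    return p.x + p.y;
}

/* Hypothetical helper: exercises the correct version. */
void check_sum(void)
{
    struct point p = { 3, 4 };

    assert(sum(p) == 7);
}
```

The exact wording and location of the diagnostic depend on the compiler; some compilers do point back at the end of the structure definition.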


    Related to this is the concept of follow-up errors. A single mistake, quite often a missing semicolon, can be reported as several errors. This is why it's important to start from the top when fixing errors, as fixing the first error might make several others disappear.

    This can of course lead to fixing one error at a time with frequent recompiles, which can be cumbersome in large projects. Recognizing follow-up errors comes with experience, though, and after seeing them a few times it becomes easier to pick out the real errors and fix more than one per recompile.