
Ada 2012 RM - Comments and String Literals


I am journeying through the Ada 2012 RM and would like to see if there is a hole in my understanding or a hole in the RM. Assuming that

    put_line ("-- this is not a comment");

is legal code, how can I deduce its legality from the RM, since section 2.7 states that "a comment starts with two adjacent hyphens and extends up to the end of the line.", while section 2.6 states "a string_literal is formed by a sequence of graphic characters (possibly none) enclosed between two quotation marks used as string brackets." It seems like there is tension between the two sections and that 2.7 would win, but that is apparently not the case.


Solution

  • To get a clearer understanding here, you need to have a look at section 2.2 in the RM.

    2.2 (1) states:

    The text of each compilation is a sequence of separate lexical elements. Each lexical element is formed from a sequence of characters, and is either a delimiter, an identifier, a reserved word, a numeric_literal, a character_literal, a string_literal, or a comment. The meaning of a program depends only on the particular sequences of lexical elements that form its compilations, excluding comments.

    And 2.2 (3/2) which states:

    "In some cases an explicit separator is required to separate adjacent lexical elements. A separator is any of a separator_space, a format_effector, or the end of a line, as follows:

    A separator_space is a separator except within a comment, a string_literal, or a character_literal.

    The character whose code point is 16#09# (CHARACTER TABULATION) is a separator except within a comment.

    The end of a line is always a separator.

    One or more separators are allowed between any two adjacent lexical elements, before the first of each compilation, or after the last."

    and

    A delimiter is either one of the following special characters:

    &    '    (    )    *    +    ,    -    .    /    :    ;    <    =    >    |
    

    or one of the following compound delimiters each composed of two adjacent special characters

    =>    ..    **    :=    /=    >=    <=    <<    >>    <>
    

    Each of the special characters listed for single character delimiters is a single delimiter except if this character is used as a character of a compound delimiter, or as a character of a comment, string_literal, character_literal, or numeric_literal.

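    So the program text is first broken down into a sequence of lexical elements, read from the start of the text. A lexical element that begins with a quotation mark is a string_literal and runs to the closing quotation mark; a lexical element that begins with -- is a comment and runs to the end of the line. Two hyphens that appear inside a string_literal are simply graphic characters of that literal, just as the rule above says a special character is not a delimiter when it is used as a character of a comment or string_literal.

    These are different lexical elements, so 2.6 and 2.7 do not conflict: in your put_line call the hyphens are already inside a string_literal, so they cannot start a comment.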
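
    As a small illustration of the point above, here is a sketch that puts both cases next to each other. The procedure name Comment_Demo is made up for this example; only Ada.Text_IO.Put_Line comes from the standard library.

    ```ada
    with Ada.Text_IO; use Ada.Text_IO;

    procedure Comment_Demo is
    begin
       --  The two hyphens in the next call sit inside a string_literal,
       --  so they are ordinary graphic characters of that literal.
       Put_Line ("-- this is not a comment");  --  this part is a comment

       --  The reverse also holds: a quotation mark inside a comment
       --  does not open a string_literal.
       --  "still part of the comment
       Put_Line ("done");
    end Comment_Demo;
    ```

    Compiled and run, this should print the first string with its leading hyphens intact, followed by done.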

    This also explains why:

    X := A - -1
          + B;
    

    is read differently from:

    X := A --1
           + B;
    

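    The space between the two hyphens makes them separate lexical elements: the first statement is read as a - delimiter, a second - delimiter, and the numeric_literal 1 (a numeric_literal never includes a sign). In the second statement the two adjacent hyphens start a comment, so --1 is discarded and the statement is simply X := A + B;. Note that this is purely about the lexical level: the Ada expression syntax does not allow a unary minus directly after a binary operator, so to actually compile the first form you have to write A - (-1).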
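
    A compilable variant of the two statements above, as a sketch: the procedure name Minus_Demo and the values of A and B are assumptions chosen for the example, and the first assignment uses parentheses around -1 so that it is a legal expression.

    ```ada
    with Ada.Text_IO; use Ada.Text_IO;

    procedure Minus_Demo is
       A : constant Integer := 10;  --  example value
       B : constant Integer := 5;   --  example value
       X : Integer;
    begin
       --  Two separate "-" delimiters: subtract minus one, then add B.
       X := A - (-1)
              + B;
       Put_Line (Integer'Image (X));  --  prints " 16"

       --  Two adjacent hyphens: the rest of the line is a comment,
       --  so this statement is just X := A + B;
       X := A --1
              + B;
       Put_Line (Integer'Image (X));  --  prints " 15"
    end Minus_Demo;
    ```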