sql antlr antlr4 ebnf

How to syntactically ignore a part of an expression in an Antlr BNF?


I would like to use Antlr to parse SQL table DDL statements, but I need only the column identifiers and column types. I do not care about constraints, and I would like to avoid writing out the whole syntax, especially for CHECK constraints, because a CHECK expression seems to require almost all of SQL.

Here is an example of such a constraint:

 CREATE TABLE "T" (
   "A" CHAR (1) CHECK ( "A" IN ('N', 'Y')),
   "B" CHAR (1) CHECK ( "B" IN ('N', 'Y'))
 );

And this is the part of the BNF, which is modeled after Jonathan Leffler's hyperlinked SQL BNF:

column_definition
    : ID data_type column_constraint_definition*
    ;

column_constraint_definition
    : constraint_name_definition? column_constraint constraint_characteristics?
    ;

constraint_name_definition
    : CONSTRAINT ID
    ;

column_constraint
    : NOT NULL
    | UNIQUE | PRIMARY KEY
    | references_specification
    | check_constraint_definition
    ;

references_specification
    : REFERENCES ID ( '(' ID ( ',' ID )* ')' )?
    ;

check_constraint_definition
    : CHECK '(' boolean_value_expression ')'
    ;

My problem: how can I ignore any boolean value expression without specifying its content in detail?

I would like to ignore everything between the left and right parentheses. But nested parentheses are allowed, so I cannot simply skip everything up to the first closing parenthesis. Instead I have to count the opening and closing parentheses. How can this be expressed in the Antlr (4) BNF?


Solution

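  • I would think something like this would work. The negated set has to exclude '(' as well as ')'; otherwise an opening parenthesis is consumed as ordinary content and the nesting is never tracked. The lexer also has to produce some token for every character that can appear inside the expression.

    check_constraint_definition
        : CHECK '(' boolean_value_expression ')'
        ;
    boolean_value_expression
        : ( ~( '(' | ')' )
          | '(' boolean_value_expression ')'
          )*
        ;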
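
To make the "count the parentheses" idea concrete outside of Antlr, here is a minimal Python sketch of the same skipping logic applied to raw text. It is only an illustration, not part of the grammar: the function name skip_balanced is hypothetical, and treating single- and double-quoted spans as opaque (so that a parenthesis inside a string literal is not counted) is an assumption about the input.

```python
def skip_balanced(text: str, start: int) -> int:
    """Return the index just past the ')' that matches the '(' at text[start].

    Parentheses inside single- or double-quoted spans are not counted.
    Raises ValueError if text[start] is not '(' or the span never closes.
    """
    if start >= len(text) or text[start] != "(":
        raise ValueError("expected '(' at start")
    depth = 0
    quote = None  # the quote character we are currently inside, if any
    for i in range(start, len(text)):
        ch = text[i]
        if quote is not None:
            if ch == quote:
                quote = None  # a doubled quote ('') just closes and reopens
        elif ch in ("'", '"'):
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i + 1
    raise ValueError("unbalanced parentheses")


# One column definition from the example table in the question.
ddl = '"A" CHAR (1) CHECK ( "A" IN (\'N\', \'Y\')),'
open_paren = ddl.index("(", ddl.index("CHECK"))
end = skip_balanced(ddl, open_paren)
skipped = ddl[open_paren:end]  # the whole CHECK body, nested parentheses included
rest = ddl[end:]               # whatever follows the constraint
```

The depth counter plays the role that recursion plays in a grammar rule: every '(' opens a level that must be closed before the outer ')' can end the constraint.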