
Disallowing Juxtaposing Keywords


I am implementing SQL's DDL syntax in Rascal and want to parse statements like this one:

CREATE EXTERNAL TABLE page_view(viewTime INT, userid BIGINT,
     page_url STRING, referrer_url STRING,
     ip STRING COMMENT 'IP Address of the User',
     country STRING COMMENT 'country of origination')
 COMMENT 'This is the staging page view table'
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'
 STORED AS TEXTFILE
 LOCATION '<hdfs_location>';

However, my grammar also accepts the following input, where the keywords are run together:

CREATEEXTERNALTABLE page_view(viewTime INT, userid BIGINT,
     page_url STRING, referrer_url STRING,
     ip STRING COMMENT 'IP Address of the User',
     country STRING COMMENT 'country of origination')
 COMMENT 'This is the staging page view table'
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'
 STORED AS TEXTFILE
 LOCATION '<hdfs_location>';

Is there a way to reject this input?

Below is the production rule:

syntax CreateTable
  = withColumns: 'CREATE' TemporaryTable? ExternalTable? 'TABLE' IfNotExists? TableName 
                        Columns?
                        Comment?
                        PartitionedByClause?
                        ClusteredByClause?
                        RowFormatClause?
                        StorageClause? 
                        LocationClause?
                        TablePropertiesClause?
  | withQuery: 'CREATE' TemporaryTable? ExternalTable? 'TABLE' IfNotExists? TableName 
                    RowFormatClause? 
                    StorageClause?
                    CreateTableQuery
  | withLike: 'CREATE' TemporaryTable? ExternalTable? 'TABLE' IfNotExists?  TableName Like TableName 
                    TablePropertiesClause?
  ;

Solution

    • Use + instead of * in the layout definition. That makes whitespace mandatory between all symbols. It becomes problematic with nullable nonterminals, because an empty optional still has layout on both sides, so two spaces are required instead of one.
    • Use follow requirements or precede requirements on the keywords, such as "SELECT" >> [\ \n]
    • Redefine the grammar so that the keyword sequence is a single lexical: lexical CREATEEXT = "CREATE" " "+ "EXTERNAL";
    • Add a " " literal between the keywords in the original rule. You will then have to adjust the follow restrictions on layout to avoid a parse error.

    There are many options. You could also reject "CREATEEXTERNALTABLE" from the identifiers:

    Id \ "CREATEEXTERNALTABLE"
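
    As a rough illustration of the second option, here is a minimal sketch that uses the closely related follow restriction (!>>) instead of a follow requirement (>>): each keyword may not be directly followed by an identifier character. The module name, the Whitespace layout, and the simplified TableName and CreateTable definitions are assumptions made for this example and are not taken from the question's full grammar.

```rascal
module DDLSketch

// Assumed layout: optional whitespace, matched greedily.
layout Whitespace = [\ \t\n\r]* !>> [\ \t\n\r];

// Hypothetical, simplified table name with longest-match behaviour.
lexical TableName = [a-zA-Z_][a-zA-Z0-9_]* !>> [a-zA-Z0-9_];

// A keyword must not be immediately followed by an identifier character,
// so 'EXTERNAL' cannot be glued to the keyword that comes after it.
syntax ExternalTable = 'EXTERNAL' !>> [a-zA-Z0-9_];

// Cut-down version of the CreateTable rule from the question.
syntax CreateTable
  = 'CREATE' !>> [a-zA-Z0-9_] ExternalTable? 'TABLE' !>> [a-zA-Z0-9_] TableName;
```

    With ParseTree imported, parse(#CreateTable, "CREATE EXTERNAL TABLE page_view") is intended to succeed, while the same call on "CREATEEXTERNALTABLE page_view" is intended to fail with a parse error, because 'CREATE' is directly followed by a letter. A restriction is used here rather than a requirement on whitespace so that a keyword may still be followed by punctuation or the end of the input.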