c postgresql apache-age opencypher

How to differentiate a Cypher clause from an SQL clause in C?


I am working on adding support for Cypher clauses to the Postgres psql client. So far, we separate Cypher clauses from SQL clauses with if statements that compare the start of the input string, and each language has its own parser. The HandleCypherCmds() function calls the Cypher parser, and the SendQuery() function calls the SQL parser.

    /* handle cypher match command */
    if (pg_strncasecmp(query_buf->data, "MATCH", 5) == 0 ||
        pg_strncasecmp(query_buf->data, "OPTIONAL", 8) == 0 ||
        pg_strncasecmp(query_buf->data, "EXPLAIN", 7) == 0 ||
        pg_strncasecmp(query_buf->data, "CREATE", 6) == 0)
    {
        cypherCmdStatus = HandleCypherCmds(scan_state,
                                           cond_stack,
                                           query_buf,
                                           previous_buf);

        success = cypherCmdStatus != PSQL_CMD_ERROR;

        if (cypherCmdStatus == PSQL_CMD_SEND)
        {
            success = SendQuery(convert_to_psql_command(query_buf->data));
        }
    }
    else
        success = SendQuery(query_buf->data);

The problem with this approach is that a keyword such as CREATE can start either an SQL statement or a Cypher clause. Also, if the user makes a typo in the clause, such as "MATH" instead of "MATCH", the input never reaches the Cypher parser. Is there a better way to differentiate a Cypher clause from an SQL one in C?
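To make the two failure cases concrete, here is a minimal standalone sketch of the same prefix test. The helper names has_prefix_ci and looks_like_cypher_by_prefix are hypothetical, and has_prefix_ci is only a stand-in for pg_strncasecmp so that the example compiles outside the psql source tree.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical stand-in for pg_strncasecmp(): case-insensitive prefix test. */
static bool
has_prefix_ci(const char *s, const char *prefix)
{
    for (; *prefix != '\0'; s++, prefix++)
    {
        /* A shorter input hits '\0' here and fails the comparison. */
        if (tolower((unsigned char) *s) != tolower((unsigned char) *prefix))
            return false;
    }
    return true;
}

/* Hypothetical helper mirroring the keyword check shown above. */
static bool
looks_like_cypher_by_prefix(const char *query)
{
    return has_prefix_ci(query, "MATCH") ||
           has_prefix_ci(query, "OPTIONAL") ||
           has_prefix_ci(query, "EXPLAIN") ||
           has_prefix_ci(query, "CREATE");
}
```

With this check, "CREATE TABLE t (id int);" is classified as Cypher even though it is plain SQL, while "MATH (n) RETURN n;" is classified as SQL, so the Cypher parser never gets the chance to report the typo.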


Solution

  • We have solved this, in case anyone is interested. Instead of doing the string comparison in the C file, we now check boolean variables that are set in the parser file.

    The user input is passed to the Cypher parser regardless of whether it is a Cypher or an SQL query, and it is sent to the server as a Cypher command only if the parser reports success. To decide this, each Cypher clause has its own boolean variable, which is set to true only if the grammar rules for that clause are satisfied. If nothing matches, all the variables stay false and the parser function returns false.

    For clarification, here is a snippet of the parser:

    %{
    /* include statements */
    ...
    bool match = false;
    bool set = false;
    bool set_path = false;
    bool create = false;
    bool drop = false;
    bool alter = false;
    bool load = false;
    ...
    %}
    
    ...
    
    %%
    statement:
        query
        | statement query
        | statement SEMICOLON { YYACCEPT; }
        ;
    
    query:
        match_clause
        | create_clause { create = true; }
        | drop_clause { drop = true; }
        | alter_clause { alter = true; }
        | load_clause { load = true; }
        | set_clause { set = true; }
        ...
        ;
    
    ...
    %%
    
    ...
    
    bool
    psql_scan_cypher_command(char* data)
    {
        ...
    
        YY_BUFFER_STATE buf = yy_scan_string(data);
        yypush_buffer_state(buf);
        yyparse();
    
        if (match || optional || explain || create || drop || alter || load ||
            set || set_path || merge || rtn || unwind || prepare || execute)
            return true;
    
        return false;
    }
    
    ...
    

    Refer to the 'cypher.y' and 'mainloop.c' files for complete reference.
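The calling side described in this answer can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from mainloop.c: QueryRoute, CypherScanFn, route_query and toy_cypher_scan are hypothetical names, the parser is passed in as a function pointer so the example compiles on its own, and toy_cypher_scan is a deliberately crude stand-in for psql_scan_cypher_command() (the real function runs the Bison grammar; its parameter is char *, and const is added here only for the sketch).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type: where the input should be sent. */
typedef enum { ROUTE_SQL, ROUTE_CYPHER } QueryRoute;

/* Shape of a Cypher scan function; const is an assumption of this sketch. */
typedef bool (*CypherScanFn) (const char *data);

/*
 * Try the Cypher parser first. Only input it accepts is treated as Cypher;
 * everything else, including typos, falls through to the SQL path.
 */
static QueryRoute
route_query(const char *query, CypherScanFn scan_cypher)
{
    if (scan_cypher(query))
        return ROUTE_CYPHER;
    return ROUTE_SQL;
}

/*
 * Toy stand-in for the real parser: accepts a keyword only when it is
 * followed by a node pattern, which is roughly what a grammar would enforce.
 */
static bool
toy_cypher_scan(const char *data)
{
    return strncmp(data, "MATCH (", 7) == 0 ||
           strncmp(data, "CREATE (", 8) == 0;
}
```

Under these assumptions, route_query("CREATE (n:Person);", toy_cypher_scan) yields ROUTE_CYPHER, while "CREATE TABLE t (id int);" and "MATH (n) RETURN n;" both yield ROUTE_SQL, so the ambiguous keyword is resolved by the grammar rather than by its prefix. Because the flags in the snippet are file-level variables, they presumably need to be reset to false before each parse; whether the elided part of psql_scan_cypher_command() does that is not shown here.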