Search code examples
sqldatabricksparseexception

"WITH x AS " ParseException Error in DataBricks notebook


I'm trying to JOIN two tables in a DataBricks Notebook. The first line in the SQL statement is erroring-out.

I can't determine why. The docs I've read say its typically due to a typo. But that is not the case for me (at least not that I can see).

Error in SQL statement: ParseException: 
no viable alternative at input 'WITH mgt '(line 1, pos 8)

== SQL ==
WITH xxx AS(
--------^^^

Solution

  • This can be caused by characters that look like ordinary space (0x20), but are something different. Unicode has quite a lot of them and it happens, especially on copy-paste, that some weird formatting corrupts SQL query string.

    For example:

    val sql = "WITH mgt AS(select 1) select * from mgt"
    spark.sql(sql)
    
    org.apache.spark.sql.catalyst.parser.ParseException:
    no viable alternative at input 'WITH mgt '(line 1, pos 8)
    
    == SQL ==
    WITH mgt AS(select 1) select * from mgt
    --------^^^
    

    Why did that occur? We can discover by looking at exact byte representation of SQL:

    sql.getBytes("utf-8").map("%02X".format(_)).mkString
    57495448206D6774E2808041532873656C6563742031292073656C656374202A2066726F6D206D6774
                    ^^^^^^
    

    Marked sequence is 0xE28080 - which is En Quad, not a space. You can backspace it and type again to fix it.