azure, pyspark, apache-spark-sql, azure-synapse

How to Identify This PySpark Coding Issue - No viable alternative at input


Trying to run the following code in Azure Synapse PySpark, I receive a parsing error. It doesn't seem like Synapse accepts the double brackets; does anyone know how to fix it?

def curated_report(entity_name):
    sqlstr = "WITH Participant_Name \
    AS (SELECT \
    CASEID, \
    PARTICIPANTID, \
    LASTWRITTEN, \
    PARTICIPANT, \
    FIRSTNAME, \
    MIDDLENAME, \
    LASTNAME \
    FROM (SELECT \
    ab.CASEID, \
    ab.PARTICIPANTID, \
    ab.DYNAMICDATATYPE, \
    ab.DYNAMICEVIDENCEVALUE, \
    ab.LASTWRITTEN \
    FROM a.ev ab \
    INNER JOIN (SELECT \
    PARTICIPANTID, \
    MAX(LASTWRITTEN) AS MAXDATE \
    FROM a.bd \
    where TYPE in ( 'PDC001' ) \
    GROUP BY PARTICIPANTID) cd \
    ON ab.PARTICIPANTID = cd.PARTICIPANTID \
    AND ab.LASTWRITTEN = cd.MAXDATE \
    GROUP BY ab.CASEID, \
    ab.PARTICIPANTID, \
    ab.DYNAMICDATATYPE, \
    ab.DYNAMICEVIDENCEVALUE, \
    ab.LASTWRITTEN) AS SOURCE \
    PIVOT(max(DYNAMICEVIDENCEVALUE) \
    FOR DYNAMICDATATYPE IN (PARTICIPANT, \
    FIRSTNAME, \
    MIDDLENAME,\
    LASTNAME) \
    ) AS RESULT) \
    SELECT * \
    FROM PARTICIPANT_NAME"   # <-- the ") AS RESULT)" line above seems to be causing the error
    df = spark.sql(sqlstr)
    return df
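As an aside, backslash continuations make a string literal like this fragile: any character after the trailing backslash (a stray space, or an inline note like the one above) breaks the literal. A triple-quoted string sidesteps the problem entirely. A minimal sketch, trimmed to a toy query with a placeholder table name:

```python
# Sketch: a triple-quoted string keeps a multi-line SQL query readable
# without any backslash continuations. Toy query for illustration only;
# the table name is a hypothetical placeholder, not a real Synapse table.
def build_query(table_name):
    return f"""WITH Participant_Name AS (
        SELECT CASEID, PARTICIPANTID
        FROM {table_name}
    )
    SELECT *
    FROM Participant_Name"""

sql = build_query("a.ev")
print(sql.splitlines()[0])  # first line of the query, no continuation characters
```

The resulting string keeps its line breaks, which Spark SQL is happy with, and the query stays copy-pasteable between a notebook cell and a SQL editor.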


Solved.

Solution

  • ParseException: no viable alternative at input 'WITH Participant_Name AS (SELECT ...

    Remove "AS RESULT" from the query. A CTE is named before the AS keyword; an alias after the closing parenthesis of the CTE body is not valid syntax, so giving the CTE its name is sufficient.

    Error screenshot: I tried to reproduce this with a similar script and got the same error.

    • To avoid the error, I removed the alias "AS RESULT" from the query, and it executed successfully.
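Spark is not unusual here: in most SQL dialects a CTE takes its name before AS, and no alias may follow the closing parenthesis of its body. As an illustration of the same rule in SQLite (chosen only because it runs locally without a Spark session; this is not Synapse):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A CTE is named before AS; this parses and runs.
ok = conn.execute(
    "WITH participant_name AS (SELECT 1 AS caseid) "
    "SELECT * FROM participant_name"
).fetchall()
print(ok)  # [(1,)]

# Appending an alias after the CTE body, as in ") AS RESULT)", is a syntax error.
err = None
try:
    conn.execute(
        "WITH participant_name AS (SELECT 1 AS caseid) AS result "
        "SELECT * FROM participant_name"
    )
except sqlite3.OperationalError as exc:
    err = exc
    print("parse error:", exc)
```

The same shape of failure ("no viable alternative" in Spark, "syntax error" in SQLite) points at the grammar rejecting the alias, not at anything Synapse-specific.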

    Corrected Code

    sqlstr = "WITH Participant_Name \
        AS (SELECT \
        CASEID, \
        PARTICIPANTID, \
        LASTWRITTEN, \
        PARTICIPANT, \
        FIRSTNAME, \
        MIDDLENAME, \
        LASTNAME \
        FROM (SELECT \
        ab.CASEID, \
        ab.PARTICIPANTID, \
        ab.DYNAMICDATATYPE, \
        ab.DYNAMICEVIDENCEVALUE, \
        ab.LASTWRITTEN \
        FROM enhanced.BDMCASEEVIDENCE ab \
        INNER JOIN (SELECT \
        PARTICIPANTID, \
        MAX(LASTWRITTEN) AS MAXDATE \
        FROM enhanced.BDMCASEEVIDENCE \
        where EVIDENCETYPE in ( 'PDC0000258' ) \
        GROUP BY PARTICIPANTID) cd \
        ON ab.PARTICIPANTID = cd.PARTICIPANTID \
        AND ab.LASTWRITTEN = cd.MAXDATE \
        GROUP BY ab.CASEID, \
        ab.PARTICIPANTID, \
        ab.DYNAMICDATATYPE, \
        ab.DYNAMICEVIDENCEVALUE, \
        ab.LASTWRITTEN) AS SOURCE \
        PIVOT(max(DYNAMICEVIDENCEVALUE) \
        FOR DYNAMICDATATYPE IN ('PARTICIPANT', \
        'FIRSTNAME', \
        'MIDDLENAME',\
        'LASTNAME') \
        )) \
        SELECT * \
        FROM PARTICIPANT_NAME"
    df = spark.sql(sqlstr)
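For anyone puzzling over what the PIVOT clause computes here: it turns one row per (participant, type) pair into one column per listed type, with max(DYNAMICEVIDENCEVALUE) as the aggregate for each cell. A pure-Python sketch of that behavior on toy rows (illustrative only, not Spark, and the names mirror the query's columns):

```python
# Toy rows: (PARTICIPANTID, DYNAMICDATATYPE, DYNAMICEVIDENCEVALUE)
rows = [
    (1, "FIRSTNAME", "Ada"),
    (1, "LASTNAME", "Lovelace"),
    (1, "FIRSTNAME", "Adeline"),  # duplicate type: max() keeps the larger string
    (2, "FIRSTNAME", "Alan"),
]

def pivot_max(rows, types):
    # One output dict per participant, one key per pivoted type,
    # max() as the aggregate -- mirroring PIVOT(max(...) FOR ... IN (...)).
    out = {}
    for pid, dtype, value in rows:
        if dtype not in types:
            continue
        cell = out.setdefault(pid, {t: None for t in types})
        if cell[dtype] is None or value > cell[dtype]:
            cell[dtype] = value
    return out

result = pivot_max(rows, ["FIRSTNAME", "LASTNAME"])
print(result)
# {1: {'FIRSTNAME': 'Adeline', 'LASTNAME': 'Lovelace'}, 2: {'FIRSTNAME': 'Alan', 'LASTNAME': None}}
```

Types a participant never has come out as None, which is also why a missing pivot column shows up as NULL in the Spark result.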