Search code examples
mysqlsqlmergeteradatateradata-sql-assistant

Conversion of merge statement to MYSQL using on duplicate key


Please suggest how to convert this teradata statement in MYSQL. As we know mysql doesn't support merge statement. Below 2 tables are also being used in select query and we have multiple primary key in each table.

MERGE INTO XYZ USING (
            SELECT
                    ITRR.WORKFLOW_NAME WORKFLOW_NAME
            ,       ITRR.INSTANCE_NAME INSTANCE_NAME
            ,       MIN(ITRR.START_TIME) EARLIEST_START_TIME
            ,       ITRR.SUBJECT_AREA SUBJECT_AREA
            ,       'INFORMATICA' PLATFORM_NAME
            FROM
                    ABC IWRR
            ,       DEF ITRR
            WHERE
                    IWRR.WORKFLOW_RUN_ID = ITRR.WORKFLOW_RUN_ID
            AND     IWRR.USER_NAME IN ('xyz')
            AND     ITRR.RUN_STATUS_CODE <> 2
            GROUP BY
                    ITRR.WORKFLOW_NAME
            ,       ITRR.INSTANCE_NAME
            ,       ITRR.SUBJECT_AREA
    ) SRC
    ON
            XYZ.PARENT_JOB_NAME = SRC.WORKFLOW_NAME
    AND     XYZ.CHILD_JOB_NAME  = SRC.INSTANCE_NAME
    AND     XYZ.SANDBOX         = SRC.SUBJECT_AREA
    WHEN MATCHED THEN UPDATE SET FIRST_EXECUTION = SRC.EARLIEST_START_TIME
    WHEN NOT MATCHED THEN INSERT
    (
            PARENT_JOB_NAME
    ,       CHILD_JOB_NAME
    ,       FIRST_EXECUTION
    ,       SANDBOX
    ,       PLATFORM_NAME
    )VALUES
    (
            SRC.WORKFLOW_NAME
    ,       SRC.INSTANCE_NAME
    ,       SRC.EARLIEST_START_TIME
    ,       SRC.SUBJECT_AREA
    ,       SRC.PLATFORM_NAME
    );

I am trying below query but it is not working.

INSERT INTO XYZ (
                PARENT_JOB_NAME
,       CHILD_JOB_NAME
,       FIRST_EXECUTION
,       SANDBOX
,       PLATFORM_NAME
    )

       (SELECT
                ITRR.WORKFLOW_NAME WORKFLOW_NAME
        ,       ITRR.INSTANCE_NAME INSTANCE_NAME
        ,       MIN(ITRR.START_TIME) EARLIEST_START_TIME
        ,       ITRR.SUBJECT_AREA SUBJECT_AREA
        ,       'INFORMATICA' PLATFORM_NAME
        FROM
                ABC IWRR
        ,       DEF ITRR
        WHERE
                IWRR.WORKFLOW_RUN_ID = ITRR.WORKFLOW_RUN_ID
        AND     IWRR.USER_NAME IN ('XYZ')
        AND     ITRR.RUN_STATUS_CODE <> 2
        GROUP BY
                ITRR.WORKFLOW_NAME
        ,       ITRR.INSTANCE_NAME
        ,       ITRR.SUBJECT_AREA
 ) SRC
ON DUPLICATE KEY UPDATE
       FIRST_EXECUTION = SRC.EARLIEST_START_TIME

Primary key of XYZ = PARENT_JOB_NAME

Primary key of ABC= SUBJECT_ID

Primary key of DEF= SUBJECT_ID,WORKFLOW_ID,WORKFLOW_RUN_ID,WORKLET_RUN_ID,INSTANCE_ID,TASK_ID,START_TIME


Solution

  • The correct syntax in MySQL is:

    INSERT INTO XYZ (PARENT_JOB_NAME, CHILD_JOB_NAME, FIRST_EXECUTION, SANDBOX, PLATFORM_NAME)
        SELECT ITRR.WORKFLOW_NAME, ITRR.INSTANCE_NAME,
               MIN(ITRR.START_TIME), ITRR.SUBJECT_AREA, 'INFORMATICA'
        FROM ABC IWRR JOIN
             DEF ITRR
             ON IWRR.WORKFLOW_RUN_ID = ITRR.WORKFLOW_RUN_ID 
        WHERE IWRR.USER_NAME IN ('XYZ') AND
              ITRR.RUN_STATUS_CODE <> 2
        GROUP BY ITRR.WORKFLOW_NAME, ITRR.INSTANCE_NAME, ITRR.SUBJECT_AREA
    ON DUPLICATE KEY UPDATE FIRST_EXECUTION = VALUES(FIRST_EXECUTION);
    

    Note the use of proper, explicit, standard, readable JOIN syntax. Use it.

    The major changes are

    • Fixing the archaic syntax.
    • Removing the parentheses are not needed for the select in an insert . . . select (although they are probably allowed).
    • Removing the table alias, which is definitely not allowed.
    • Fixing the on duplicate key statement.