Search code examples
google-bigquerygoogle-cloud-sqlupsert

Cloud SQL to BigQuery Upsert Query using external connection


Referring to this link BigQuery dynamic UPSERT with EXECUTE IMMEDIATE I just wanted to follow same procedure to periodically Insert or Updated any changed data into BigQuery but from external Cloud SQL connection. I have established the connection with Cloud SQL. Now that I just need query that works well for me to achieve this. How do I modify above linked solution to use EXTERNAL_QUERY() as data source?


Solution

  • I recreated your issue and here's the solution where I have written a procedure to take values dynamically from Cloud SQL using EXTERNAL_QUERY() and insert or update based on conditions.

    CREATE OR REPLACE PROCEDURE `<YOUR_BIGQUERY_DATASET>.LoadDataFromCloudSQL`()
    BEGIN
      DECLARE sql_query STRING;
      DECLARE column_names STRING;
      DECLARE merge_statement STRING;
      DECLARE transformed_columns STRING;
    
    
      
      -- Retrieve data from Cloud SQL using EXTERNAL_QUERY()
      SET sql_query = "SELECT * FROM EXTERNAL_QUERY('<EXTERNAL_CONNECTION_NAME>', '''SELECT * FROM <YOUR_TABLE_NAME> WHERE created_at > now() - interval '24 hours';''')";
    
      -- Get the column names from the source table
      SET column_names = (
        SELECT STRING_AGG(column_name, ', ')
        FROM (
          SELECT column_name
          FROM `<YOUR_BIGQUERY_TABLE>.INFORMATION_SCHEMA.COLUMNS`
          WHERE table_name = '<YOUR_TABLE_NAME>'
        )
      );
    
      -- Get the columns for update query
      SET transformed_columns = (
        SELECT STRING_AGG(CONCAT(column, ' = source.', column), ', ')
        FROM UNNEST(SPLIT(column_names, ',')) AS column
      );
    
    
      -- Generate the MERGE statement dynamically
      SET merge_statement = CONCAT('''
        MERGE `<YOUR_BIGQUERY_TABLE>` AS target
        USING (
          ''', sql_query, '''
        ) AS source
        ON target.id = source.id
        WHEN MATCHED THEN
          UPDATE SET ''',transformed_columns,'''
        WHEN NOT MATCHED THEN
          INSERT (''', column_names, ''')
          VALUES (''', column_names, ''')
      ''');
    
      -- Insert new rows and update existing rows in BigQuery table
      EXECUTE IMMEDIATE merge_statement;
    
      -- Output the completion message
      SELECT 'Data transfer completed.' AS result;
    END;