Search code examples
jsonmergesnowflake-cloud-data-platformvariant

How to merge variant column data from two different tables in snowflake


I have two tables with around thousands of column each.Table contains variant data columns.I want to merge the variant data column based on key value in variant data.

eg:

Table 1 column name: SRC value : {col1:val1,col2:val2,col3:val3}

Table 2: column name: SRC value : {col1:val1,col2:val2,col4:val4}

I want output after merging : Table 1 : SRC = {col1:val1,col2:val2,col3:val3,col4:val4}

I want existing keys in variant column to update and new keys to insert in table.

I have tried it with object_insert() method but it can update a single field at a time and it is being hard to implement syntactically for thousand of columns.How to achieve this with multiple fields. Can anyone help me on this?


Solution

  • You can use a Javascript UDF to do this. Here is a naive example that you can build on that does a very simple shallow merge:

    -- Create an example table
    create or replace transient table test_table (column1 VARIANT, column2 VARIANT);
    
    -- Insert some sample data
    INSERT OVERWRITE INTO test_table (column1, column2)
    select PARSE_JSON('{"a": "row1_val_a", "c": "row1_val_c"}'), PARSE_JSON('{"b": "row1_val_b"}')
    union all
    select PARSE_JSON('{"a": "row2_val_a", "b": "row2_val_b"}'), PARSE_JSON('{"c": "row2_val_c"}')
    ;
    
    -- Have a look at the table
    select * from test_table;
    
    -- Create the javascript UDF that merges the two columns
    CREATE OR REPLACE FUNCTION merge_json(OBJ_1 VARIANT, OBJ_2 VARIANT)
        RETURNS variant
        LANGUAGE JAVASCRIPT
        AS 
        $$
        function extend(obj, src) {
          for (var key in src) {
              if (src.hasOwnProperty(key)) obj[key] = src[key];
          }
          return obj;
        }
        return extend(OBJ_1, OBJ_2)
        $$;
    
    -- Use the UDF in a SQL query
    select merge_json(column1, column2) from test_table;
    

    This example assumes that the VARIANT objects are in the same table just in separate columns. You could change it to grab the Variants from different tables pretty easily.