Search code examples
sqlstringconcatenationsnowflake-cloud-data-platform

Snowflake CONCAT strings with DISTINCT values


I am doing a merge where a string exists in my target table.

To this I want to concat a new delta value which is a ; separated string list in my

WHEN MATCHED .... UPDATE clause

eg.

set TARGET_VALUE = 'abc';

set DELTA = 'abc;def';

After concat:

OUTPUT should be abc;def;


select CONCAT($TARGET_VALUE,';', $DELTA);



  Gives me    abc;abc;def.

Is there a way to achieve this inline without converting the strings to Array and following: How to concatenate arrays in Snowflake with distinct values? ?

I want to avoid creating a custom function or snippet and check if there is an inline way to achieve this.

I wrote a UDF to achieve this, but is there an inline way ?

set TARGET_VALUE = 'xyhx;abc';

set DELTA = 'abc;def';

select string_distinct_concat($TARGET_VALUE, $DELTA);

// Distinct Concatenate
create or replace function string_distinct_concat(target varchar, delta varchar)
    returns variant
    language javascript
    comment = 'Returns a distinct concatenation of two strings'
as
$$

     if (TARGET == null) {
        return DELTA
     }
      
     if (DELTA == null) {
        return TARGET
     }
     const target_array = TARGET.split(';');
     const delta_array = DELTA.split(';');
     const set = new Set([...target_array,...delta_array])
     return [...set].join(';');
$$
;

Solution

  • Here is one approach using split_to_table and listagg with distinct.

    set TARGET_VALUE = 'abc';
    
    set DELTA = 'abc;def';
    
    select listagg(distinct t.value,';') final_val from 
    table(split_to_table(concat($DELTA,';',$TARGET_VALUE),';')) t;
    
    FINAL_VAL
    abc;def

    Set target with partial string as in DELTA. This should be included in final output.

    set TARGET_VALUE = 'ab';
    
    select listagg(distinct t.value,';') final_val from 
    table(split_to_table(concat($DELTA,';',$TARGET_VALUE),';')) t;
    
    FINAL_VAL
    abc;def;ab

    Here are two approaches for NULL handling, I prefer CTE approach

    with var_cte as
    (
    select column1 as var from values 
    ($DELTA),
    ($TARGET_VALUE)
    ), no_null_cte as
    (
    select t.value from var_cte, table(split_to_table(var,';')) t
    )select listagg(distinct value,';') from no_null_cte;
    

    Using NVL -

    select listagg(distinct t.value,';') from table
    (
    split_to_table(concat(nvl($DELTA,''),';',nvl($TARGET_VALUE,'')),';')) t 
    where t.value != '';