Search code examples
sqlsnowflake-cloud-data-platformdbt

How to map object/json array in Snowflake SQL / DBT Macro?


id some_attribute json_array
1 "abc" [ { attr: 'apple'}, { attr: 'banana' } ]

How to get the get rid of attr in json_array so that the table results into something like table below?

id some_attribute string_array
1 "abc" [ 'apple', 'banana' ]

Use case is during the cleaning stage of the data to make further processing and analysis simpler in later stages of the pipeline.

Thx for the help!


Solution

  • Another option is to create a JavaScript UDF. For example

    CREATE OR REPLACE FUNCTION ARRAY_JSON_VALUES("a" ARRAY, "attr" STRING) 
    RETURNS ARRAY 
    LANGUAGE JAVASCRIPT RETURNS NULL ON NULL INPUT IMMUTABLE
    AS 
    $$ 
      return a.map(e => e[attr]);
    $$
    

    then

    WITH data AS(
      SELECT 1 id, 'abc' as some_attribute, [{ 'attr': 'apple'}, { 'attr': 'banana' } ] as json_array
    )
    SELECT 
      id
    , some_attribute
    , ARRAY_JSON_VALUES(json_array,'attr') as string_array
    FROM
      data
    

    again returns

    ID|SOME_ATTRIBUTE|STRING_ARRAY      |
    --+--------------+------------------+
     1|abc           |["apple","banana"]|