Search code examples
snowflake-cloud-data-platformdbt

how to convert snowflake table to different structure


id name DESCRIPTION ACTIVE UPDATED_JSON
id1 name-1 desc-1 true {"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-1","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"}
id2 name-2 desc-2 true {"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-2","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"}

I have a table of the above structure in snowflake. UPDATED_JSON is a variant column. I want to change this table to have the structure similar to the one below. In UPDATED_JSON I have fieldName, when its value is name I need to update the name column to have valueAfter data. diffFields is not ordered. If name in updated_json is not present, I want to leave name column with its current value.

in the below example , name-1 changed to new-segment-name-1 because UPDATED_JSON has a fieldName with value name and valueAfter with value new-segment-name-1

id name DESCRIPTION ACTIVE
id1 new-segment-name-1 desc-1 true
id2 new-segment-name-2 desc-2 true

I am trying to to this with dbt


Solution

  • you data as a CTE:

    WITH data(id, name, DESCRIPTION, ACTIVE, UPDATED_JSON) as (
        select column1, column2, column3, column4, parse_json(column5) from values    
        ('id1', 'name-1', 'desc-1', true,'{"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-1","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"}'),
       ('id2', 'name-2', 'desc-2', true, '{"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-2","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"}')
    )
    
    select id
        ,max(iff(f.value:fieldName::text = 'name', f.value:valueAfter::text, null))  as name
        ,DESCRIPTION
        ,active
    from data, table(flatten(input=>UPDATED_JSON:diffFields)) f
    group by 1,3,4;       
    

    gives:

    ID NAME DESCRIPTION ACTIVE
    id2 new-segment-name-2 desc-2 TRUE
    id1 new-segment-name-1 desc-1 TRUE