id | name | DESCRIPTION | ACTIVE | UPDATED_JSON |
---|---|---|---|---|
id1 | name-1 | desc-1 | true | {"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-1","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"} |
id2 | name-2 | desc-2 | true | {"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-2","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"} |
I have a table of the above structure in snowflake. UPDATED_JSON
is a variant column. I want to change this table to have the structure similar to the one below.
In UPDATED_JSON
I have fieldName
, when its value is name
I need to update the name
column to have valueAfter
data. diffFields
is not ordered. If name
in updated_json
is not present, I want to leave name
column with its current value.
in the below example , name-1
changed to new-segment-name-1
because UPDATED_JSON
has a fieldName
with value name
and valueAfter
with value new-segment-name-1
id | name | DESCRIPTION | ACTIVE |
---|---|---|---|
id1 | new-segment-name-1 | desc-1 | true |
id2 | new-segment-name-2 | desc-2 | true |
I am trying to to this with dbt
you data as a CTE:
WITH data(id, name, DESCRIPTION, ACTIVE, UPDATED_JSON) as (
select column1, column2, column3, column4, parse_json(column5) from values
('id1', 'name-1', 'desc-1', true,'{"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-1","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"}'),
('id2', 'name-2', 'desc-2', true, '{"diffFields": [{"fieldName": "name","valueAfter": "new-segment-name-2","valueBefore": null},{"fieldName": "active","valueAfter": true,"valueBefore": null}],"segmentId": "b204c220-ea8d-4cf4-b579-30eb59a1a2a4"}')
)
select id
,max(iff(f.value:fieldName::text = 'name', f.value:valueAfter::text, null)) as name
,DESCRIPTION
,active
from data, table(flatten(input=>UPDATED_JSON:diffFields)) f
group by 1,3,4;
gives:
ID | NAME | DESCRIPTION | ACTIVE |
---|---|---|---|
id2 | new-segment-name-2 | desc-2 | TRUE |
id1 | new-segment-name-1 | desc-1 | TRUE |