Search code examples
sqlarraysjsonsnowflake-cloud-data-platformquerying

Flatten and reconstruct JSON Snowflake


I am still learning Snowflake, any help would be really appreciated.

I have a column, let's call it 'result'.

{
  "catalog": [
    {
      "img_href": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179361.jpg",
      "name": "ADITI HAND BLOCKED PRINT",
      "price": 16
    },
    {
      "img_href": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179330.jpg",
      "name": "TORBAY HAND BLOCKED PRINT",
      "price": 17
    },
    {
      "img_href": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179362.jpg",
      "name": "ADITI HAND BLOCKED PRINT",
      "price": 18
    }
  ],
  "datetime": 161878993658
  "catalog_id": 1
}

I would like to flatten it and reconstruct as below

[
  {
    "datetime": 161878993658,
    "url": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179361.jpg"
  },
  {
    "datetime": 161878993658,
    "url": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179330.jpg"
  },
  {
    "datetime": 161878993658,
    "url": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179362.jpg"
  },
]

Solution

  • The following will do this. You won't need the CTE, so delete it and replace uses of tbl with the name of your table and uses of json with your variant column.

    /*delete this line*/ with tbl as (select parse_json($1) json from values('{"catalog":[{"img_href":"https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179361.jpg","name":"ADITI HAND BLOCKED PRINT","price":16},{"img_href":"https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179330.jpg","name":"TORBAY HAND BLOCKED PRINT","price":17},{"img_href":"https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179362.jpg","name":"ADITI HAND BLOCKED PRINT","price":18}],"datetime":161878993658,"catalog_id":1}'))
    
    select array_agg(new_col) reconstructed
    from (
      /* replace json and tbl */ select object_construct('datetime', json:datetime, 'url', obj.value:img_href) new_col, json:catalog_id catalog_id
      from tbl, lateral flatten(json:catalog) obj
    ) group by catalog_id;
    

    It outputs

    [
      {
        "datetime": 161878993658,
        "url": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179361.jpg"
      },
      {
        "datetime": 161878993658,
        "url": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179330.jpg"
      },
      {
        "datetime": 161878993658,
        "url": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179362.jpg"
      }
    ]