Search code examples
sqlarrayssnowflake-cloud-data-platform

How to return only certain objects and key-value pairs from array in Snowflake


Consider the following table in Snowflake:

Column_Name Type
order_id int
cust_id varchar
details variant

Sample Data:

with sample_data as (
 select 1 as order_id, 'aaa' as cust_id, 
    '{
      "channel": "phone",
      "result": "approved", 
      "reason": null,
      "pairs": [
        {
          "key": "size",
          "value": "large"
        },
        {
          "key": "color",
          "value": "red"
        },
        {
          "key": "fit",
          "value": "slim"
        },
        {
          "key": "pattern",
          "value": "zebra"
        }
      ]
    }' as details union all
 select 2 as order_id, 'bbb' as cust_id, 
    '{
      "channel": "store",
      "result": "denied", 
      "reason": "stock",
      "pairs": [
        {
          "key": "size",
          "value": null
        },
        {
          "key": "pattern",
          "value": "tiger"
        }
      ]
    }' as details
)
select *
from sample_data

Each array may contain dozens or hundreds of objects and key-value pairs, but I only want certain ones. So from the sample data, I would like to extract only the channel object, result object and certain pairs (where key is size or color).

Desired Output for ORDER_ID 1:

{
  "channel": "phone",
  "result": "approved", 
  "pairs": [
    {
      "key": "size",
      "value": "large"
    },
    {
      "key": "color",
      "value": "red"
    }
  ]
}

Desired Output for ORDER_ID 2:

{
  "channel": "store",
  "result": "denied", 
  "pairs": [
    {
      "key": "size",
      "value": null
    }
  ]
}

The full desired output would look like:

order_id cust_id details
1 aaa the desired array above for order 1
2 bbb the desired array above for order 2

I can use lateral_flatten() within a CTE to allow me to then filter the "pairs" that I want, but I'm not sure how to: (1) put the desired pairs back together as an array, or; (2) add "channel" and "result" objects back to my modified array. I'm not using array_construct in the right manner as it's returning a row for every object. Was hoping it could be used like a windows fuction (partition by) but it doesn't work that way.


Solution

  • Using a higher order function FILTER and object functions(pick/insert) to build required structure:

    SELECT t.order_id,
           t.cust_id,
           OBJECT_INSERT(OBJECT_PICK(t.details, ['channel', 'result']), 'pairs', 
                        FILTER(t.details:pairs, a->a:key::TEXT IN ('size', 'color'))
           ) AS details 
    FROM sample_data t;
    

    Output:

    enter image description here