sql amazon-redshift dbt data-transform

Extract key-value pairs with the same key in Redshift


I have a column in which each cell contains a JSON array of objects. I am trying to find a way to extract the value of a given key from every object in the array. For example, I would like to extract each occurrence of 'product_id' from the following array:

[{"id": 11993176146155, "fulfillable_quantity": 0, "product_id": 7538905317611, "properties": [], "quantity": 1, "requires_shipping": true, "taxable": false, "total_discount": "0.00", "total_discount_set": {"shop_money": {"amount": "0.00", "currency_code": "PKR"}, "presentment_money": {"amount": "0.00", "currency_code": "PKR"}}}, {"id": 11993176178923, "price_set": {"shop_money": {"amount": "450.00", "currency_code": "PKR"}, "presentment_money": {"amount": "450.00", "currency_code": "PKR"}}, "product_exists": true, "product_id": 7018040164543}]

I want to create a new column called 'Product_ID' and store the extracted values in it. My final result should look like this:

[Image: expected result]

Is there a way in Redshift to iterate over the entire array in each row and extract the desired values (in this case, the product IDs)? I have looked at some of Redshift's functions, but unfortunately they didn't help. It would be great if someone could provide a solution.


Solution

  • You need to unnest the array and then select the keys you want.

    There is a great tutorial for unnesting arrays in Redshift here: https://blog.getdbt.com/how-to-unnest-arrays-in-redshift/
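
As a rough sketch of the unnest-then-select approach described above: cross the table with a small set of index numbers, pull out each array element by position, then read 'product_id' from each element. The table name `orders` and the columns `order_id` and `line_items` are hypothetical, and the sketch assumes `line_items` is a VARCHAR holding valid JSON text. The hand-written `numbers` CTE only covers arrays of up to 10 elements, so extend it (or use a real numbers table) if your arrays can be longer.

```sql
-- Hypothetical table: orders(order_id, line_items)
-- line_items is assumed to be a VARCHAR containing a JSON array of objects.
with numbers as (
    -- Zero-based array positions 0..9; extend for longer arrays.
    select 0 as n union all
    select 1 union all
    select 2 union all
    select 3 union all
    select 4 union all
    select 5 union all
    select 6 union all
    select 7 union all
    select 8 union all
    select 9
),

unnested as (
    -- One output row per array element: keep only the positions
    -- that actually exist in each row's array.
    select
        o.order_id,
        json_extract_array_element_text(o.line_items, numbers.n) as item
    from orders as o
    inner join numbers
        on numbers.n < json_array_length(o.line_items)
)

-- Pull the wanted key out of each unnested element.
select
    order_id,
    json_extract_path_text(item, 'product_id') as product_id
from unnested;
```

For the sample array in the question this should yield one row per element, so two product IDs for that order. If some rows may contain malformed JSON, the Redshift JSON functions accept an optional final `null_if_invalid` argument (`true`) to return NULL instead of raising an error.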