Search code examples
jsonamazon-redshiftjson-extract

JSON fields in Redshift


I am wanting to split out 'housing' key onto different rows. So in this example I would expect 4 rows. Within each row would have the following columns:

id (this is the key on the table), housing.id, housing.name, user.id, hours, isplanningtime

Example of JSON.

[{"housing":{"id":"02d3ca45-36b3-4114-91b7-39ed12e1548b","name":"texting"},"user":{"id":"75bd4cad-acc9-420d-9d5e-4d2851a4b9c4","name":"person_person","jobTitle":"Sales Manager","avatar":null,"email":"[email protected]","disabled":false},"hours":4,"isPlanningTime":false},{"housing":{"id":"02d3ca45-36b3-4114-91b7-39ed12e1548b","name":"testing"},"user":null,"hours":4,"isPlanningTime":false},{"housing":{"id":"02d3ca45-36b3-4114-91b7-39ed12e1548b","name":"testing"},"user":null,"hours":4,"isPlanningTime":false},{"housing":{"id":"02d3ca45-36b3-4114-91b7-39ed12e1548b","name":"testing"},"user":null,"hours":4,"isPlanningTime":false}]

I have managed to do this with the following code


    SELECT cd.id,
           json_extract_path_text(json_extract_array_element_text(cd.value, 0), 'housing', 'id') AS housing_id_0,
           json_extract_path_text(json_extract_array_element_text(cd.value, 0), 'housing', 'name') AS housing_name_0,
           json_extract_path_text(Json_extract_array_element_text(cd.value, 0), 'user', 'id') AS user_id_0,
           json_extract_path_text(json_extract_array_element_text(cd.value, 0), 'hours') AS hours_0,
           json_extract_path_text(json_extract_array_element_text(cd.value, 0), 'isPlanningTime') AS isPlanningTime_0
    FROM "data" cd
    LEFT JOIN "field" c ON c.id = cd.field_id
    WHERE c.name = 'time'

Using Union All, I copied the code and changed

    json_extract_array_element_text(cd.value, 1)
    json_extract_array_element_text(cd.value, 2)

and so on.

The problem is there could be 100+ instances of 'housing' in the JSON.

Is there a better way of doing it?

Thanks


Solution

  • You are unnesting a super array and the AWS docs describe how to do this:

    https://docs.aws.amazon.com/redshift/latest/dg/query-super.html

    This will basically make a new row for each element in the top array of your data. Then you can apply your selection SQL of the individual fields that you want to extract.