Search code examples
postgresqljsonpath

PostgreSQL query on Json having nested Array


I have below json data stored in Postgres DB with column as json:

  {"users": [
    {
      "id": 1,
      "data": [
        {
          "pos": "BA",
          "endDate": "2022-07-31",
          "startDate": "2022-01-01"
        },
        {
          "pos": "BA2",
          "endDate": "2022-09-30",
          "startDate": "2022-08-01"
        },
        {
          "pos": "BA3",
          "endDate": "2023-03-31",
          "startDate": "2022-10-01"
        },
        {
          "pos": "BA4",
          "endDate": "2023-06-08",
          "startDate": "2023-04-01"
        }
      ]
    },
    {
      "id": 2,
      "data": [
        {
          "pos": "BA",
          "endDate": "2022-07-31",
          "startDate": "2022-01-01"
        },
        {
          "pos": "BA2",
          "endDate": "2022-09-30",
          "startDate": "2022-08-01"
        },
        {
          "pos": "BA3",
          "endDate": "2023-03-31",
          "startDate": "2022-10-01"
        },
        {
          "pos": "BA4",
          "endDate": "2023-06-08",
          "startDate": "2023-04-01"
        }
      ]
    },
    {
      "id": 3,
      "data": [
        {
          "pos": "BA",
          "endDate": "2022-07-31",
          "startDate": "2022-01-01"
        },
        {
          "pos": "BA2",
          "endDate": "2022-09-30",
          "startDate": "2022-08-01"
        },
        {
          "pos": "BA3",
          "endDate": "2023-03-31",
          "startDate": "2022-10-01"
        },
        {
          "pos": "BA4",
          "endDate": "2023-06-08",
          "startDate": "2023-04-01"
        }
      ]
    }
  ]
}

I need to write a query to filter the data based on the startDate and endDate. So if the startDate is 2022-01-01 and endDate is 2022-12-31 then query should return below json:

  {"users": [
    {
      "id": 1,
      "data": [
        {
          "pos": "BA",
          "endDate": "2022-07-31",
          "startDate": "2022-01-01"
        },
        {
          "pos": "BA2",
          "endDate": "2022-09-30",
          "startDate": "2022-08-01"
        }
      ]
    },
    {
      "id": 2,
      "data": [
        {
          "pos": "BA",
          "endDate": "2022-07-31",
          "startDate": "2022-01-01"
        },
        {
          "pos": "BA2",
          "endDate": "2022-09-30",
          "startDate": "2022-08-01"
        }
      ]
    },
    {
      "id": 3,
      "data": [
        {
          "pos": "BA",
          "endDate": "2022-07-31",
          "startDate": "2022-01-01"
        },
        {
          "pos": "BA2",
          "endDate": "2022-09-30",
          "startDate": "2022-08-01"
        }
      ]
    }
  ]
}

Can someone help me with the query ?


Solution

  • Well, this is a bit complicated. Inside-out, first flatten the JSON field then filter and finally re-aggregate into a JSON object again. Please note however that it is worth reviewing the data design. A normalized one would make things way better and easier. Please also note that it is extremely difficult to maintain data quality and consistency in a complex composite JSON field while supported out-of-the-box in a proper normalized model.

    select json_build_object('users', json_agg(t))
    from
    (
      select jvalue ->> 'id' as id, json_agg(j) as data
      from
      json_array_elements
      (
        '{"users":[
          {"id":1,"data":[{"pos":"BA","endDate":"2022-07-31","startDate":"2022-01-01"},{"pos":"BA2","endDate":"2022-09-30","startDate":"2022-08-01"},{"pos":"BA3","endDate":"2023-03-31","startDate":"2022-10-01"},{"pos":"BA4","endDate":"2023-06-08","startDate":"2023-04-01"}]},
          {"id":2,"data":[{"pos":"BA","endDate":"2022-07-31","startDate":"2022-01-01"},{"pos":"BA2","endDate":"2022-09-30","startDate":"2022-08-01"},{"pos":"BA3","endDate":"2023-03-31","startDate":"2022-10-01"},{"pos":"BA4","endDate":"2023-06-08","startDate":"2023-04-01"}]},
          {"id":3,"data":[{"pos":"BA","endDate":"2022-07-31","startDate":"2022-01-01"},{"pos":"BA2","endDate":"2022-09-30","startDate":"2022-08-01"},{"pos":"BA3","endDate":"2023-03-31","startDate":"2022-10-01"},{"pos":"BA4","endDate":"2023-06-08","startDate":"2023-04-01"}]}
        ]}'::json -> 'users'
      ) as jvalue
      cross join lateral json_array_elements(jvalue -> 'data') as j(value_object)
      where (value_object ->> 'startDate')::date >= '2022-01-01' 
        and (value_object ->> 'endDate')::date <= '2022-12-31'
      group by id
    ) as t;