Search code examples
c#azure-cosmosdbazure-cosmosdb-sqlapi

How to filter Azure Cosmos DB child arrays


I have the following data in Azure Cosmos DB. I want to filter the data based on between data.date and contains employee.full_name. The following query give undesired result. I really need help to solve this problem. This link How to query nested array with Cosmos DB doesn't seem to fit my expected result.

Cosmos DB data

    {
      "id": "ED-BSC1",
      "data": [
        {
          "date": "2023-05-09",
          "employees": [
            {
              "employee_id": "E3948",
              "full_name": "Donni",
              "loc": "Building F",
              "floor": 5
            },
            {
              "employee_id": "E9372",
              "full_name": "Viona",
              "loc": "Building F",
              "floor": 5
            },
            {
              "employee_id": "E4551",
              "full_name": "Abby",
              "loc": "Building A",
              "floor": 6
            }
          ]
        },
        {
          "date": "2023-05-10",
          "employees": [
            {
              "employee_id": "E9032",
              "full_name": "Brian",
              "loc": "Building A",
              "floor": 12
            },
            {
              "employee_id": "E9033",
              "full_name": "Emma",
              "loc": "Building A",
              "floor": 11
            },
            {
              "employee_id": "E9034",
              "full_name": "Emily",
              "loc": "Building A",
              "floor": 11
            }
          ]
        },
        {
          "date": "2023-05-11",
          "employees": [
            {
              "employee_id": "E1042",
              "full_name": "Diana",
              "loc": "Building C",
              "floor": 14
            },
            {
              "employee_id": "E5323",
              "full_name": "Dereck",
              "loc": "Building A",
              "floor": 11
            },
            {
              "employee_id": "E9034",
              "full_name": "Emily",
              "loc": "Building B",
              "floor": 19
            }
          ]
        }
      ]
    }

I'm using this query to achieve that and it's failed. It still diplays the whole data

    SELECT
        c.id,
        c.data
    from 
        c 
        join d in c.data 
        join e in d.employees   
    where 
        c.id = 'ED-BSC1' 
        and d.date >= '2023-05-10' 
        and d.date <= '2023-05-11' 
        and e.full_name like '%Em%'

Here's the expected result from the query above

    {
      "id": "ED-BSC1",
      "data": [
        {
          "date": "2023-05-10",
          "employees": [
            {
              "employee_id": "E9033",
              "full_name": "Emma",
              "loc": "Building A",
              "floor": 11
            },
            {
              "employee_id": "E9034",
              "full_name": "Emily",
              "loc": "Building A",
              "floor": 11
            }
          ]
        },
        {
          "date": "2023-05-11",
          "employees": [
            {
              "employee_id": "E9034",
              "full_name": "Emily",
              "loc": "Building B",
              "floor": 19
            }
          ]
        }
      ]
    }

Solution

  • I want to filter the data based on between data.date and contains employee.full_name. The following query give undesired result.

    • To get the required solution, I have used array function in nested queries. To know more about array function please refer the document here.
    • In the first query, I filtered with name and then filtered the output of the first query with dates, as mentioned in the above requirement.
    • Below is the Query:
    SELECT c.id,
    ARRAY(
            SELECT VALUE 
            {
                "date": d.date,
                "employees": ARRAY(               
                    SELECT VALUE e FROM e IN d.employees                
                    WHERE CONTAINS(e.full_name, "Em")
                )
            }
            
            FROM d IN c.data WHERE d.date >= '2023-05-10' AND d.date <= '2023-05-11'          
            
        ) AS data FROM c WHERE c.id = 'ED-BSC1'
    

    Below is the desired result :

    [
        {
            "id": "ED-BSC1",
            "data": [
                {
                    "date": "2023-05-10",
                    "employees": [
                        {
                            "employee_id": "E9033",
                            "full_name": "Emma",
                            "loc": "Building A",
                            "floor": 11
                        },
                        {
                            "employee_id": "E9034",
                            "full_name": "Emily",
                            "loc": "Building A",
                            "floor": 11
                        }
                    ]
                },
                {
                    "date": "2023-05-11",
                    "employees": [
                        {
                            "employee_id": "E9034",
                            "full_name": "Emily",
                            "loc": "Building B",
                            "floor": 19
                        }
                    ]
                }
            ]
        }
    ]
    
    • Here is the screenshot of the query run along with output in Cosmos DB.