Search code examples
azureazure-cosmosdbazure-cosmosdb-sqlapi

Query documents with array of array in Cosmos DB


I've documents in Cosmos DB with a following structure:

{
  "id": "pst_bf6i9GM0JTN8W1m66N8F",
  "postTitle": "Post 1",
  "memberships": [
      [
          {
              "chain": 1,
              "contractAddress": "0xcB3165Df24BAe1aB590616DE7a950Dd804f39A3C",
              "id": "mbs_IUTG9kJG84Gv2jr17Y8y"
          },
          {
            "chain": 1,
            "contractAddress": "0x00758fc71Cee3AC591E69fF9010c88782E678De2",
            "id": "mbs_jUTG9kJG83Gv2jr17Y8y"
          }
      ],
      [
          {
              "chain": 80001,
              "contractAddress": "0x358738C2Fa36bbb3e983Dc0CC17798E0ef0653f2",
              "id": "mbs_cUTG9kJGx3Gv2jr17Y8l"
          }
      ]
  ]
}

The memberships property is an array of array that in the app logic represents the requirements (AND/OR) for getting access to the post protected content (useless to give more details about that).

I'd like to write a query that can extract all posts that contains a paired list of contractAddress and chain values. How can I do it?

I tried in many ways, but the only one I was able to implement is creating a Stored Procedure and filter items using code.

To explain better I will write the query in pseudo-code (or better not correct code) since I don't how to do it properly (and even if it is possible):

SELECT c.id FROM c WHERE ARRAY_LENGTH(c.memberships) > 0
AND (ARRAY_CONTAINS(c.memberships, {contractAddress: "0x00758fc71Cee3AC591E69fF9010c88782E678De2", chain: 1}) OR
ARRAY_CONTAINS(c.memberships, {contractAddress: "0x358738C2Fa36bbb3e983Dc0CC17798E0ef0653f2", chain: 80001})

The expected output of this query is to extract exactly the document ID of the document presented before:

[
  {
    "id": "pst_bf6i9GM0JTN8W1m66N8F",
  }
]

Solution

  • You could use a self-join to get the ability to filter on the sub elements of memberships. Then add a boolean parameter to the ARRAY_CONTAINS to do partial matches, since you don't specify the id of the membership inside the ARRAY_CONTAINS. The self-join will have the side effect that it'll duplicate the result if there's multiple matches for the same document. That can be overcome by a DISTINCT in the SELECT:

    SELECT DISTINCT c.id 
    FROM c
    JOIN m IN c.memberships
    WHERE ARRAY_LENGTH(m) > 0
    AND (
        ARRAY_CONTAINS(m, {contractAddress: '0x00758fc71Cee3AC591E69fF9010c88782E678De2', chain: 1}, true) 
        OR ARRAY_CONTAINS(m, {contractAddress: '0x358738C2Fa36bbb3e983Dc0CC17798E0ef0653f2', chain: 80001}, true)
    )