Search code examples
azureazure-cosmosdbazure-cosmosdb-sqlapi

How can I query cosmosdb/sql where two properties change over time


I have a database that has documents with a Version property and Tier property

example:

{
    partitionKey: "AEDlCE4k4Xl/hv6o7ZNtSX9B5XjtfOKxYDCgGbQB518=",
    id: "1055984d-01bd-4fbf-877f-0c18ce91552e",
    Version: 1,
    Tier: "basic"
},
{
    partitionKey: "ABClCE4k4Xl/hv6o6ZNtSX9B5XjtfOKxYDCgGbQB424=",
    id: "1055984d-01bd-4fbf-877f-0c18ce91552e",
    Version: 2,
    Tier: "standard"
}

I want to be able to query my documents to find which ids have a Version=1 and Tier=basic and also have a Version=2 and Tier=standard.

I've tried to write a query like this:

select c.id from c
where (c.Version = 1 and c.TierId = "basic")
AND (c.Version = 2 and c.TierId = "standard")

I don't get any results back though. I think it's looking for a specific doc that has both version 1 and version 2. Is there a way I can update my query where it doesn't use the AND?

EDIT: I wanted to also mention that I have a separate partitionkey that is not the id. Updated my example docs above


Solution

  • The below gets kind of close (If Condition1Met and Condition2Met are both 1 then documents meeting both of the conditions were present for an id)

    SELECT c.id, 
           MAX(c.Version = 1 AND c.Tier = "basic" ? 1 : 0) AS Condition1Met, 
           MAX(c.Version = 2 AND c.Tier = "standard" ? 1 : 0) AS Condition2Met
    FROM c
    where (c.Version = 1 AND c.Tier = "basic")
    OR (c.Version = 2 AND c.Tier = "standard")
    GROUP BY c.id 
    

    But in practice unless the collection and result set are small this is unlikely to be suitable as "You can't use continuation tokens for queries with GROUP BY".

    (maybe you could use a home made keyset pagination approach though and tack on an ORDER BY c.id and add a condition to the WHERE that the next batch should be > than the highest seen in the previous batch)

    I think the best options are.

    1. Write a query with the WHERE clause with all OR-ed conditions, ordered by id, and projecting the id, Version, and Tier properties. And have client code process this and keep track whilst iterating through this for each id seen which condition(s) they matched - and use this to find the id matching both conditions.
    2. See if you are in a position to use the analytical store for this collection. You can then write pretty much any type of cross document query against that. (Most likely current blockers to this would probably be if you use continuous backup or if the latency of the synch to this was not acceptable)