Tags: azure, azure-cosmosdb, azure-cosmosdb-sqlapi, consistency

Can I use a client constructed session token for cosmosdb?


I've done a bit of research on using session tokens in cosmosdb's dotnet v3 sdk, and so far I've found these two links that give me some hints on how to use them: utilize-session-tokens and how-to-convert-session-token.

In our scenario, we would like strong consistency for updates that belong to the same userId (we don't want strong consistency for all data), so that when one instance updates data under this user, every other instance immediately sees the result. We'd also like to use cosmosdb as a lock for another scenario.

However, the links above only show how to reuse a token that was returned from creating a document. I wonder whether I can construct my own session token and use it to get strong consistency.

For example, if I want to update the data under a specific userId I would use {userId}:-1#1 as the session token. Would that be a valid way to use the session token? I'm also not sure what the fields pkrangeid, Version, GlobalLSN mean and what role they play when cosmosdb deals with consistency.

Thanks in advance!


Solution

  • The session token includes the LSN, which the client cannot create. The session token must be issued by the service, as that is the only way to provide the guarantees of Session consistency.

    If what you want is in-region strong consistency, or more accurately the read-your-own-writes guarantee provided by Session consistency, and you are using a single instance of the Cosmos client, you already have it. You do not need to manage the session token; the Cosmos SDK does it for you.

    If you have multiple instances of the Cosmos DB client in separate processes and you want read-your-own-writes guarantees across them, you have two options.

    1. Use Bounded Staleness consistency, which provides in-region strong consistency for all Cosmos client instances reading and writing in that region. It does this by reading from two replicas. Since Cosmos DB always commits a write to 3 of the 4 replicas in a replica set, any two replicas you read must include at least one that has the latest write; Cosmos DB checks the LSN from both replicas and returns the data with the higher LSN if they don't match. The upside of this approach is that it is very easy to implement. The downside is that point reads (i.e. ReadItemAsync()) cost twice as much because each one reads from two replicas.

    2. Use Session consistency together with Stateful Entities in Durable Functions, or something similar, to implement a distributed mutex that stores and updates the session token across multiple Cosmos client instances. The upside is that point reads still cost 1 RU. The downside is the added complexity, and all writes are serialized, because every write has to queue up against the mutex so that each client instance can update the stored token. Note: if your clients are in the same process but on multiple threads, you could use a concurrent collection instead. That is simpler, but it still requires synchronizing threads, so it will reduce write throughput in the client under high concurrency.
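
The quorum argument behind option 1 can be checked with a small simulation. This is only a sketch in plain Python, not Cosmos DB code: it assumes a replica set of four, a write acknowledged by three replicas, and a read that contacts two replicas and keeps the higher LSN. The names `quorum_read` and `read_always_current` are made up for this illustration.

```python
from itertools import combinations


def quorum_read(replica_lsns, read_set):
    """Return the highest LSN among the replicas that were read."""
    return max(replica_lsns[i] for i in read_set)


def read_always_current(replicas=4, write_quorum=3, read_quorum=2,
                        old_lsn=41, new_lsn=42):
    """Try every write-set/read-set pairing and report whether any read is stale.

    The replica counts are assumptions for this sketch, not values read
    from a real Cosmos DB account.
    """
    ids = range(replicas)
    for write_set in combinations(ids, write_quorum):
        # Replicas in the write set hold the new LSN; the others lag behind.
        lsns = [new_lsn if i in write_set else old_lsn for i in ids]
        for read_set in combinations(ids, read_quorum):
            if quorum_read(lsns, read_set) != new_lsn:
                return False
    return True


if __name__ == "__main__":
    print(read_always_current())               # two-replica reads
    print(read_always_current(read_quorum=1))  # single-replica reads
```

With three of four replicas written, at most one replica lags, so any pair of replicas contains the latest write. A single-replica read can land on the lagging replica, which is why the second call reports a possible stale read.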
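
For the same-process variant of option 2, the idea of keeping the latest service-issued token behind a mutex might look roughly like the sketch below. It is plain Python with a fake container standing in for the service, so `FakeContainer`, `SharedSessionTokens`, and the `0:-1#N` token shape are all placeholders for illustration. The point it tries to show is that the token always comes from the write response, never from the client, and that updating it safely serializes the writers.

```python
import threading


class FakeContainer:
    """Stand-in for the service: only it knows the LSN, so only it issues tokens."""

    def __init__(self):
        self._lsn = 0
        self._lock = threading.Lock()

    def upsert(self, user_id, value):
        with self._lock:
            self._lsn += 1
            # Placeholder token shape; a real token is opaque to the client.
            return f"0:-1#{self._lsn}"


class SharedSessionTokens:
    """Latest service-issued session token per user, shared by all writers."""

    def __init__(self):
        self._tokens = {}
        self._mutex = threading.Lock()

    def write(self, container, user_id, value):
        # The write and the token update happen under one mutex, so the stored
        # token always belongs to the most recent write. This serializes writers.
        with self._mutex:
            token = container.upsert(user_id, value)
            self._tokens[user_id] = token
            return token

    def token_for(self, user_id):
        with self._mutex:
            return self._tokens.get(user_id)


def run_demo(threads=8, writes_per_thread=50):
    container = FakeContainer()
    store = SharedSessionTokens()

    def worker():
        for n in range(writes_per_thread):
            store.write(container, "user-1", n)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return store.token_for("user-1")


if __name__ == "__main__":
    print(run_demo())
```

In the .NET v3 SDK, the real counterparts would be the token returned on the write response (`response.Headers.Session`) and the `SessionToken` property on `ItemRequestOptions` for the read. A reader in another thread would fetch the stored token for the user and pass it along with its read request.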