node.js azure azure-cosmosdb gremlin azure-cosmosdb-gremlinapi

Understanding "x-ms-request-charge" and "x-ms-total-request-charge" in CosmosDB Gremlin API

I am using the gremlin package (version 3.4.6) to query my Cosmos DB account through the Gremlin (Graph) API. The code is fairly straightforward:

const gremlin = require('gremlin');

const authenticator = new gremlin.driver.auth.PlainTextSaslAuthenticator(
    `/dbs/<database-name>/colls/<container-name>`,
    "<my-account-key>"
);
const client = new gremlin.driver.Client(
    "wss://<account-name>.gremlin.cosmosdb.azure.com:443/",
    {
        authenticator,
        traversalsource : "g",
        rejectUnauthorized : true,
        mimeType : "application/vnd.gremlin-v2.0+json"
    }
);

// Submit a traversal and log the result, which includes the attributes shown below.
client.submit("g.V()")
    .then((result) => {
        console.log(result);
    })
    .catch((error) => {
        console.error(error);
    });

The code is working perfectly fine and I am getting the result back. The result object has an attributes property which looks something like this:

{
    "x-ms-status-code": 200,
    "x-ms-request-charge": 0,
    "x-ms-total-request-charge": 123.85999999999989,
    "x-ms-server-time-ms": 0.0419,
    "x-ms-total-server-time-ms": 129.73709999999994,
    "x-ms-activity-id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}

Notice that two attributes relate to the request charge (basically, how expensive my query is): x-ms-request-charge and x-ms-total-request-charge.

I have three questions regarding this:

1. What's the difference between the two?
2. x-ms-request-charge always comes back as 0, while x-ms-total-request-charge is non-zero. Why is that?
3. Which value should I use to calculate the request charge? My guess is x-ms-total-request-charge, since it is the non-zero value.

And while we're at it, I would appreciate it if someone could explain the difference between x-ms-server-time-ms and x-ms-total-server-time-ms as well.

Solution

These response headers are specific to our Gremlin API and are documented under "Azure Cosmos DB Gremlin server response headers".

For a single request, the Gremlin server can send multiple partial response messages. Each one is loosely equivalent to a page, but the messages are returned as a stream rather than as separate request/response pairs with continuations, as is done with the SQL API.

x-ms-request-charge is the number of RUs consumed to resolve a single partial response.
x-ms-total-request-charge is the running total of RUs consumed up to and including the current partial response. When the final message is sent, it therefore denotes the total RUs consumed by the entire request.
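
To make the relationship between the two values concrete, here is a small sketch. The three partial responses and their numbers are made up for illustration, and sumPartialCharges is a hypothetical helper name. It only shows that adding up the per-message charges gives the same figure as the running total on the last message. It would also be consistent with a final x-ms-request-charge of 0 if the last partial response happened to cost nothing, though that is an inference rather than something stated above.

```javascript
// Made-up sample data: status attributes of three partial responses that
// belong to one request. The numbers are illustrative, not real measurements.
const partialResponses = [
    { "x-ms-request-charge": 2.5,  "x-ms-total-request-charge": 2.5 },
    { "x-ms-request-charge": 3.25, "x-ms-total-request-charge": 5.75 },
    { "x-ms-request-charge": 1,    "x-ms-total-request-charge": 6.75 }
];

// Hypothetical helper: add up the charge reported by each partial response.
function sumPartialCharges(responses) {
    return responses.reduce(
        (sum, attributes) => sum + attributes["x-ms-request-charge"],
        0
    );
}

const summed = sumPartialCharges(partialResponses);

// The running total carried by the last partial response.
const finalTotal =
    partialResponses[partialResponses.length - 1]["x-ms-total-request-charge"];

console.log(summed, finalTotal); // prints: 6.75 6.75
```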

Depending on the Gremlin client driver implementation, each partial response may be exposed to the caller, or the driver may accumulate all responses internally and return only a final result. The latter case prompted us to add x-ms-total-request-charge, so that drivers implemented this way can still report the total cost of the request.
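
For a driver that returns only the final result, as the Node.js code in the question appears to, reading the total could look roughly like the sketch below. The helper names readAttribute and totalRequestCharge are hypothetical. The sketch assumes the result exposes an attributes property as shown in the question, and it accepts either a plain object or a Map because the shape may depend on the serializer in use.

```javascript
// Hypothetical helper: look up one attribute by name. `attributes` is assumed
// to be either a plain object (as in the question's output) or a Map.
function readAttribute(attributes, name) {
    if (attributes instanceof Map) {
        return attributes.get(name);
    }
    return attributes ? attributes[name] : undefined;
}

// Hypothetical helper: return the total RU charge of a completed request,
// or undefined if the attribute is missing or not a number.
function totalRequestCharge(result) {
    const value = readAttribute(result.attributes, "x-ms-total-request-charge");
    return typeof value === "number" ? value : undefined;
}
```

With the client from the question, this would be called inside the .then callback, for example console.log(totalRequestCharge(result)).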

Thanks for the question and hope this is helpful.