Search code examples
azureazure-cosmosdb

How to query against array in Cosmos using sql?


This is my data where I want to get stockNumber if operationId in lineItems is occurring more than once.

i.e., as an output, my query should return 2002762240 out of these 2 records because it has same operationId twice.

Doc-1

{
  "stockNumber": "2002762240",
  "lineItems": [
    {
      "operationId": 8424759,
      "partLineId": null
    },
    {
      "operationId": 8424759,
      "partLineId": 0
    }
  ],
  "paymentItems": [
    {
      "subletDetails": [
        {
          "operationId": 8424759
        }
      ]
    }
  ]
}

Doc-2

{
  "stockNumber": "2002762241",
  "lineItems": [
    {
      "operationId": 8424760,
      "partLineId": 0
    }
  ],
  "paymentItems": [
    {
      "subletDetails": [
        {
          "operationId": 8424760
        }
      ]
    }
  ]
}

Approach-1: This gives me records which also exists in paymentItems. I only care for lineItems

SELECT top 100
  c.stockNumber AS stockNumber
FROM c
WHERE ARRAY_LENGTH(ARRAY(SELECT VALUE l.operationId FROM l IN c.lineItems GROUP BY l.operationId)) > 1

Approach-2: Throws BadRequest error

SELECT DISTINCT c.stockNumber 
FROM c JOIN l IN c.lineItems 
GROUP BY l.operationId 
HAVING COUNT(1) > 1

Approach-3: Throws BadRequest error

SELECT top 100
  c.stockNumber AS stockNumber
FROM c
WHERE ARRAY_LENGTH(c.lineItems) > 0
and EXISTS (
    select l.operationId from l in c.lineItems GROUP BY l.operationId 
HAVING COUNT(1) > 1
)

What am I missing here?


Solution

  • The error was caused by using HAVING in the query. Cosmos doesn't support the HAVING clause, so it is better to use the WHERE clause, as shown in the code below.

    Use the following query to get stockNumber where operationId in lineItems occurs more than once.

    SELECT 
        result.stockNumber
    FROM (
        SELECT 
            c.stockNumber AS stockNumber,
            l.operationId AS operationId,
            COUNT(1) AS operationIdCount
        FROM c
        JOIN l IN c.lineItems
        GROUP BY c.stockNumber, l.operationId
    ) AS result
    WHERE result.operationIdCount > 1
    

    Output:

    [
        {
            "stockNumber": "2002762240"
        }
    ]