Search code examples
sqldatabasecouchbasesql++

How do you extract the maximum and minimum value from a nested array in N1QL query?


Background

I'm am running the following query (I know it may be hard to read, so I'll clarify what it's doing afterwards):

SELECT *
FROM `my_bucket` AS a
    NEST (
    SELECT c.toNode, c.fromNode, d.endDateTime, d.startDateTime
    FROM `my_bucket` AS c
        JOIN (
        SELECT id, customAttributes.endDateTime, customAttributes.startDateTime
        FROM `my_bucket`
        WHERE type='airNode'
            AND customAttributes.endDateTime >= CLOCK_UTC()
            AND customAttributes.endDateTime <= DATE_ADD_STR(CLOCK_UTC(), 14, 'day')) AS d ON c.toNode = d.id
        AND c.type='relNode') b ON a.id = b.fromNode AND a.type='edNode';

Essentially, what's happening here is that we're pulling the relevant airNodes from the database, and joining them with the relNodes where relNodes.toNode = airNodes.id (toNode and id are fields from their respective nodes). Then, we're joining these relAirNodes with edNodes, and the statement returns with the following structure:

[ { 'edNode':..., 'relAirNodes':[{...},{...}] },...]

So essentially, we're getting an array of pairs, where each pair consists of one edNode and an array of relAirNodes.

Question

Each relAirNode in the arrays of relAirNodes has an endTime and startTime field. Is there a way to run the query such that the return result is an array of pairs, the first element of the pair being an editNode and the second element of the pair being a single struct with the minimum startTime of the array of relAirNodes and the maximum endTime of the array of relAirNodes? In other words, the return object would look like:

[ { 'edNode':..., 'relAirNodes':{'startTime':..., 'endTime':...}}, ...]

Solution

  • Couchbase JOINS are LEFT to RIGHT. If you are looking only INNER JOINS only re-arrange JOINs such that it eliminate more early.

    Create indexes described here https://blog.couchbase.com/ansi-join-support-n1ql/

    SELECT en.*, c AS relAirNodes
    FROM (SELECT rn.fromNode, MIN(b.startDateTime) AS startDateTime, MAX(b.endDateTime) AS endDateTime
          FROM (SELECT an.id, an.customAttributes.endDateTime, an.customAttributes.startDateTime
                FROM my_bucket AS an
                WHERE an.type = "airNode"
                      AND an.customAttributes.endDateTime BETWEEN CLOCK_UTC() AND DATE_ADD_STR(CLOCK_UTC(), 14, "day")
                ) AS b
          JOIN my_bucket AS rn ON rn.type = "relNode" AND rn.toNode = b.id
          GROUP BY rn.fromNode) AS c
    JOIN my_bucket AS en ON en.type = "edNode" AND en.id = c.fromNode
    ;
    
    CREATE INDEX ix1 ON my_bucket(customAttributes.endDateTime, customAttributes.startDateTime, id) WHERE type = "airNode";
    CREATE INDEX ix2 ON my_bucket(toNode, fromNode) WHERE type = "relNode";
    CREATE INDEX ix3 ON my_bucket(id) WHERE type = "edNode";
    

    If need Outer Joins (Requires CB 6.5.0+ https://blog.couchbase.com/ansi-join-enhancements-and-ansi-merge/)

    SELECT en.*, c AS relAirNodes
    FROM my_bucket AS en
    LEFT JOIN (SELECT rn.fromNode, MIN(b.startDateTime) AS startDateTime, MAX(b.endDateTime) AS endDateTime
               FROM (SELECT an.id, an.customAttributes.endDateTime, an.customAttributes.startDateTime
                     FROM my_bucket AS an
                     WHERE an.type = "airNode"
                           AND an.customAttributes.endDateTime BETWEEN CLOCK_UTC() AND DATE_ADD_STR(CLOCK_UTC(), 14, "day")
                ) AS b
               JOIN my_bucket AS rn ON rn.type = "relNode" AND rn.toNode = b.id
               GROUP BY rn.fromNode) AS c ON en.id = c.fromNode
    WHERE en.type = "edNode";
    

    Also there are ARRAY_MIN(), ARRAY_MAX() functions. ex : ARRAY_MIN(relAirNodes[*].startDateTime) on array of objects.