How to compare the current row to the rest of the rows in an analytic SQL function


SQL - Question with Oracle 19c:

I process a dataset with an analytic function (with PARTITION BY, without ordering). For each row, I would like to compare the current row with the rest of the rows within the group and, where there is a match, count for example how many matches there are.

I would expect something like this (how often is the same data available in the group):

sum(case when current_row.dt = row.dt then 1 else 0 end) over (partition by x)

So my question is:

Can I compare the current row in the dataset to each row in the group by referencing the current row's value against every other row in the group?

I can join the data to each row and then run the comparison. But as I need many such comparisons, this will end up in many joins, so I would expect the functionality to be available out of the box. And yes, I could write my own aggregate function, which would do the trick as well, but as I said, shouldn't that be part of the SQL standard?

A simple example would be:

DROP TABLE t;
CREATE TABLE t
(
    name  VARCHAR2(10)
  , num1  NUMBER
  , num2  NUMBER
);

INSERT INTO t SELECT 'Hans',   1, 3 FROM DUAL;
INSERT INTO t SELECT 'Werner', 2, 5 FROM DUAL;
INSERT INTO t SELECT 'John',   1, 7 FROM DUAL;
INSERT INTO t SELECT 'Paul',   3, 4 FROM DUAL;
INSERT INTO t SELECT 'Fritz',  3, 3 FROM DUAL;

COMMIT;

SELECT t1.name
     , t1.num1
     , t1.num2
     , cnt
  FROM t t1
       CROSS JOIN LATERAL (
                              SELECT COUNT(1)     AS cnt
                                FROM t t2
                               WHERE t2.num1 <= t1.num1
                                     AND t2.num2 >= t1.num1
                          );

Solution

  • I would expect something like this (how often is the same data available in the group):

    sum(case when current_row.dt = row.dt then 1 else 0 end) over (partition by x)
    

    Don't use a CASE expression, just include dt in the PARTITION BY clause:

    SUM(1) OVER (PARTITION BY x, dt)
    

    or:

    COUNT(*) OVER (PARTITION BY x, dt)
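
    As a sketch against the example table t from the question, assuming num1 plays the role of dt and the whole table is treated as a single group (so there is no x column to partition by):

    ```sql
    -- For each row, count how many rows in t share its num1 value
    -- (the row itself is included in the count).
    SELECT name
         , num1
         , num2
         , COUNT(*) OVER (PARTITION BY num1) AS same_num1
    FROM   t;
    ```

    With the five sample rows, this should return 2 for Hans and John (num1 = 1), 1 for Werner (num1 = 2), and 2 for Paul and Fritz (num1 = 3).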
    

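    You cannot compare arbitrary rows inside an analytic function: the function only sees the window defined by PARTITION BY, ORDER BY and the windowing clause, and cannot test each row in the window against the current row's values. For any other comparison you will need to use a self-JOIN.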
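
    A minimal sketch of such a self-join on the example table t, assuming the comparison of interest is the range check from the question, that the rows of t are distinct, and that name is never NULL (the alias cnt is illustrative):

    ```sql
    -- Pair every row t1 with each row t2 whose [num1, num2] range
    -- contains t1.num1, then count the matching t2 rows per t1 row.
    SELECT   t1.name
           , t1.num1
           , t1.num2
           , COUNT(t2.name) AS cnt
    FROM     t t1
             LEFT JOIN t t2
                    ON t1.num1 BETWEEN t2.num1 AND t2.num2
    GROUP BY t1.name
           , t1.num1
           , t1.num2;
    ```

    The LEFT JOIN keeps rows of t1 that match nothing, which then get a count of 0. For the sample data the counts should be 2 for Hans and John, 3 for Werner, and 5 for Paul and Fritz.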

    But as I need many such comparisons, this will end up in many joins.

    If the problem is that you need several such aggregations, you can use a single LATERAL join with conditional aggregation, so that you join only once rather than once per calculation:

    SELECT t1.name
         , t1.num1
         , t1.num2
         , t2.cnt1
         , t2.cnt2
         , t2.cnt3
    FROM   t t1
           CROSS JOIN LATERAL (
             SELECT COUNT(CASE WHEN t1.num1 BETWEEN t2.num1 AND t2.num2 THEN 1 END) AS cnt1
                  , COUNT(CASE WHEN t1.num1 >= t2.num1 THEN 1 END) AS cnt2
                  , COUNT(CASE WHEN t1.num1 <= t2.num2 THEN 1 END) AS cnt3
             FROM   t t2
           ) t2;
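
    For the five sample rows, working the three conditions through by hand gives (cnt1, cnt2, cnt3) = (2, 2, 5) for Hans and John, (3, 3, 5) for Werner, and (5, 5, 5) for Paul and Fritz.

    As a hedged side note, a comparison against a single ordered column can sometimes be written with a windowing clause instead of a join. The sketch below reproduces cnt2 only, assuming num1 is never NULL; cnt1 and cnt3 compare two different columns and are not covered by it.

    ```sql
    -- cnt2 counts the rows t2 with t2.num1 <= t1.num1. Ordering the window
    -- by num1 and using a RANGE frame up to the current row covers exactly
    -- those rows, including peers that share the current row's num1.
    SELECT name
         , num1
         , num2
         , COUNT(*) OVER (ORDER BY num1
                          RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cnt2
    FROM   t;
    ```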