SQL - Question with Oracle 19c:
I run over a dataset with an analytic function (with PARTITION BY, without ordering). For each row, I would like to compare the current row with the other rows within the group and, where there is a match, do something like count how many matches there are.
I would expect something like this (how often is the same data available in the group):
sum(case when current_row.dt = row.dt then 1 else 0 end) over (partition by x)
So my question is:
Can I compare the current row to every other row in the group, referencing the current row's value in the comparison against each of them?
I could join the data to each row and then run the comparison, but since I need many such comparisons, this would end up in many joins. I would therefore expect this functionality to be available out of the box. And yes, I could write my own aggregate function to do the trick, but as said, shouldn't that be part of the SQL standard?
A simple example would be:
drop table t;
CREATE TABLE t
(
  name VARCHAR2(10)
, num1 NUMBER
, num2 NUMBER
);
INSERT INTO t SELECT 'Hans', 1, 3 FROM DUAL;
INSERT INTO t SELECT 'Werner', 2, 5 FROM DUAL;
INSERT INTO t SELECT 'John', 1, 7 FROM DUAL;
INSERT INTO t SELECT 'Paul', 3, 4 FROM DUAL;
INSERT INTO t SELECT 'Fritz', 3, 3 FROM DUAL;
COMMIT;
SELECT t1.name
     , t1.num1
     , t1.num2
     , cnt
FROM   t t1
       CROSS JOIN LATERAL (
         -- count the rows whose [num1, num2] range contains t1.num1
         SELECT COUNT(1) AS cnt
         FROM   t t2
         WHERE  t2.num1 <= t1.num1
         AND    t2.num2 >= t1.num1
       );
I would expect something like this (how often is the same data available in the group):
sum(case when current_row.dt = row.dt then 1 else 0 end) over (partition by x)
Don't use a CASE expression, just include dt in the PARTITION BY clause:
SUM(1) OVER (PARTITION BY x, dt)
or:
COUNT(*) OVER (PARTITION BY x, dt)
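As a small sketch against the sample table t from the question: it has no x or dt columns, so this assumes the whole table is one group and uses num1 in place of dt. The alias same_num1_cnt is made up for illustration.

```sql
-- Assumption: the whole table is a single group and num1 stands in for dt.
-- Each row gets the number of rows sharing its num1 value,
-- including the row itself.
SELECT name
     , num1
     , COUNT(*) OVER (PARTITION BY num1) AS same_num1_cnt
FROM   t;
```

With the five sample rows, Hans and John (num1 = 1) get 2, Werner (num1 = 2) gets 1, and Paul and Fritz (num1 = 3) get 2. Subtract 1 if the current row should not count as a match for itself.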
You cannot compare rows inside an analytic function; you will need to use a self-JOIN to compare rows.
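For a single comparison, a plain self-join could look roughly like the sketch below. It uses the sample table t and the same condition as the LATERAL query in the question; grouping by ROWID is an assumption made here so that two rows with identical values are still counted separately.

```sql
-- Sketch: one self-join per comparison, aggregated back to one row per t1 row.
-- LEFT JOIN keeps t1 rows that match nothing; COUNT(t2.num1) is then 0 for
-- them, because t2.num1 is NULL only when no t2 row was joined.
SELECT   t1.name
       , t1.num1
       , t1.num2
       , COUNT(t2.num1) AS cnt
FROM     t t1
         LEFT JOIN t t2
         ON  t2.num1 <= t1.num1
         AND t2.num2 >= t1.num1
GROUP BY t1.ROWID, t1.name, t1.num1, t1.num2;
```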
But as I need many such comparisons, this will end up in many joins.
If your problem is wanting to perform multiple aggregations then you can use a single LATERAL join and then use conditional aggregation so you only need to join once (rather than using one join for each calculation):
SELECT t1.name
     , t1.num1
     , t1.num2
     , t2.cnt1
     , t2.cnt2
     , t2.cnt3
FROM   t t1
       CROSS JOIN LATERAL (
         -- one pass over t per outer row; each COUNT applies its own condition
         SELECT COUNT(CASE WHEN t1.num1 BETWEEN t2.num1 AND t2.num2 THEN 1 END) AS cnt1
              , COUNT(CASE WHEN t1.num1 >= t2.num1 THEN 1 END) AS cnt2
              , COUNT(CASE WHEN t1.num1 <= t2.num2 THEN 1 END) AS cnt3
         FROM   t t2
       ) t2;