I have the following (simplified) situation in two databases:
ID Prog T Qt
|---------|--------|---------|---------|
| a | 1 | N | 100 |
| b | 1 | Y | 10 |
| b | 2 | N | 90 |
| c | 1 | N | 25 |
| c | 2 | Y | 25 |
| c | 3 | Y | 25 |
| c | 4 | Y | 25 |
|---------|--------|---------|---------|
ID Prog T Qt
|---------|--------|---------|---------|
| 1 | 1 | Y | 10 |
| 1 | 2 | N | 90 |
| 2 | 1 | Y | 100 |
| 3 | 1 | Y | 100 |
| 4 | 1 | Y | 50 |
| 4 | 2 | Y | 25 |
| 4 | 3 | Y | 25 |
|---------|--------|---------|---------|
I need to compare groups of rows (primary keys are ID and Prog), to find out which groups of rows represent the same combination of factors (not considering ID).
In the example above, ID "b" in the first table and ID "1" in the second have the same combination of values for Prog, T and Qt, while no one else can be considered exactly the same between the 2 dbs (while ID "2" and "3" in the second table are equal, I'm not interested in comparing in the same db).
I hope I explained everything.
A join and aggregation should work for this purpose:
select t1.id, t2.id
from (select t1.*, count(*) over (partition by id) as cnt
from t1
) t1 join
(select t2.*, count(*) over (partition by id) as cnt
from t2
) t2
on t1.prog = t2.prog and t1.T = t2.T and t1.Qt = t2.Qt and t1.cnt = t2.cnt
group by t1.id, t2.id, t1.cnt
having count(*) = t1.cnt;
This is a little tricky. The subqueries count the number of rows for each id in each table. The on
clause gets matches between the three columns -- and checks that the ids have the same count. The group by
and having
then get rows where number of matching rows is the total number of rows.