Suppose one table t as following:
c1 c2 == == a 1 a 2 b 1 b 2 b 3 c 4 c 2
We group this table by c1, and it has three groups: a, b, c. I need to calculate the similarity of column c2 between two groups, like the following:
sim(a,b) = 2(common value of c2 are 1 and 2)/3(all value)=2/3 sim(b,c) = 1(b and c has only one value 2 in common)/4 = 1/4 sim(a,c) = 1/3
Can we use SQL (Oracle 11g syntax first) to construct the expression above?
I believe this query does what you want:
select t1.c1, t2.c1, count(*) as NumInCommon,
(select count(distinct t.c2)
from t
where t.c1 in (t1.c1, t2.c1)
) as NumInTotal
from t t1 join
t t2
on t1.c2 = t2.c2
group by t1.c1, t2.c1