oracle12c,sql,difference between count(*) and sum()

Tell me the difference between sql1 and sql2:

sql1:

select count(1)
from table_1 a
inner join table_2 b on a.key = b.key where a.id in (
  select id from table_1 group by id having count(1) > 1
)

sql2:

select sum(a) from (
  select count(1) as a
  from table_1 a
  inner join table_2 b on a.key = b.key group by a.id having count(1) > 1
)

Why is the output not the same?

Solution

The queries are not even similar. They are very different. Let's check the first one:

select count(1)
from table_1 a
inner join table_2 b
on a.key = b.key 
where a.id in (
  select id from table_1 group by id having count(1) > 1
) ;

You are first making an inner join:

select count(1) 
from table_1 a 
inner join table_2 b 
on a.key = b.key

In this case, you can use count(1), count(id), count(*), it's equivalent. You are counting the common elements in both tables: those ones that have in common the key field.

After that, you are enforcing this:

where a.id in (
      select id from table_1 group by id having count(1) > 1
    )

In other words, that every "id" of the table_1 must be at least two times in the table_1 table.

And lastly, you are doing this:

select count(1)

In other words, counting those elements. So, translated into english you have done this:

get every record of table_1 and pair with records of table_2 for the id, and get only those that match
for the result above, filter out only the elements whose id of the table_1 appears more than one time
count that result

Let's see what happens with the second query:

select sum(a) from (
  select count(1) as a
  from table_1 a
  inner join table_2 b 
  on a.key = b.key 
  group by a.id 
  having count(1) > 1
);

You are making the same inner join:

  select count(1) as a
  from table_1 a
  inner join table_2 b 
  on a.key = b.key

but, you are grouping it by the id of the table:

  group by a.id

and then filtering out only those elements who appear more than one time:

  having count(1) > 1

The result so far are a set of records that have in common the key field in both tables, but grouped by the id: this means that only those fields that are at leas two times in the table_b are outputed of this join. After that, you group by id, collapsing those results into the table_1.id field and counting the result. I presume that very few records will match this strict criteria.

And lastly, you sum all those set.