Search code examples
sqlcountsumoracle12c

oracle12c,sql,difference between count(*) and sum()


Tell me the difference between sql1 and sql2:

sql1:

select count(1)
from table_1 a
inner join table_2 b on a.key = b.key where a.id in (
  select id from table_1 group by id having count(1) > 1
) 

sql2:

select sum(a) from (
  select count(1) as a
  from table_1 a
  inner join table_2 b on a.key = b.key group by a.id having count(1) > 1
)

Why is the output not the same?


Solution

  • The queries are not even similar. They are very different. Let's check the first one:

    select count(1)
    from table_1 a
    inner join table_2 b
    on a.key = b.key 
    where a.id in (
      select id from table_1 group by id having count(1) > 1
    ) ;
    

    You are first making an inner join:

    select count(1) 
    from table_1 a 
    inner join table_2 b 
    on a.key = b.key
    

    In this case, you can use count(1), count(id), count(*), it's equivalent. You are counting the common elements in both tables: those ones that have in common the key field.

    After that, you are enforcing this:

    where a.id in (
          select id from table_1 group by id having count(1) > 1
        ) 
    

    In other words, that every "id" of the table_1 must be at least two times in the table_1 table.

    And lastly, you are doing this:

    select count(1)
    

    In other words, counting those elements. So, translated into english you have done this:

    1. get every record of table_1 and pair with records of table_2 for the id, and get only those that match
    2. for the result above, filter out only the elements whose id of the table_1 appears more than one time
    3. count that result

    Let's see what happens with the second query:

    select sum(a) from (
      select count(1) as a
      from table_1 a
      inner join table_2 b 
      on a.key = b.key 
      group by a.id 
      having count(1) > 1
    );
    

    You are making the same inner join:

      select count(1) as a
      from table_1 a
      inner join table_2 b 
      on a.key = b.key 
    

    but, you are grouping it by the id of the table:

      group by a.id 
    

    and then filtering out only those elements who appear more than one time:

      having count(1) > 1
    

    The result so far are a set of records that have in common the key field in both tables, but grouped by the id: this means that only those fields that are at leas two times in the table_b are outputed of this join. After that, you group by id, collapsing those results into the table_1.id field and counting the result. I presume that very few records will match this strict criteria.

    And lastly, you sum all those set.