Search code examples
sqloracle-databasegroup-bycountdatabase-partitioning

Using Over Partition in sql compared with group by


Given the table creation code bellow, is there an alternative method(s) to display the same result to

select b.*, count(*) over (partition by colour) bricks_total 
from bricks b;

using group by and count(*)? And what's the difference in this case?

create table bricks 
(
     brick_id integer,
     colour   varchar2(10),
     shape    varchar2(10),
     weight   integer
);

insert into bricks values (1, 'blue', 'cube', 1);
insert into bricks values (2, 'blue', 'pyramid', 2);
insert into bricks values (3, 'red', 'cube', 1);
insert into bricks values (4, 'red', 'cube', 2);
insert into bricks values (5, 'red', 'pyramid', 3);
insert into bricks values (6, 'green', 'pyramid', 1);

commit;

Solution

  • This query puts the total for each colour on each row:

    select b.*, count(*) over (partition by colour) as bricks_total
    from bricks b;
    

    Before window functions, a typical solution would be a correlated subquery:

    select b.*,
           (select count(*) from bricks b2 where b2.colour = b.colour) as bricks_total
    from bricks b;
    

    You can also express this using join and aggregation:

    select b.*, bb.bricks_total
    from bricks b join
         (select bb.colour, count(*) as bricks_total
          from bricks bb
          group by bb.colour
         ) bb
         using (colour);
    

    These are not 100% the same. The difference is that the original code will return the count of colour even when the value is NULL. This code returns 0.

    So, a more precise equivalent would be:

    select b.*,
           (select count(*)
            from bricks b2
            where b2.colour = b.colour or
                  b2.colour is null and b.colour is null
           ) as bricks_total
    from bricks b;