Search code examples
sqlmysqlinner-joingreatest-n-per-group

How can I count the number of records with a weight property


I have these two tables:

    CREATE TABLE `papers` (
      `id` int(11) DEFAULT NULL,
      `Paperti` varchar(255) DEFAULT NULL,
      `Tow` varchar(255) DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

    INSERT INTO `papers` VALUES ('1', 'ti1', 'Article');
    INSERT INTO `papers` VALUES ('2', 'ti2', 'Conference');
    INSERT INTO `papers` VALUES ('3', 'ti3', 'Letter');
    INSERT INTO `papers` VALUES ('4', 'ti4', 'Article');
    INSERT INTO `papers` VALUES ('5', 'ti4', 'Letter');

    CREATE TABLE `typeofwwork` (
      `rec` int(11) DEFAULT NULL,
      `tow` varchar(255) DEFAULT NULL,
      `weight` int(11) DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
    
    INSERT INTO `typeofwwork` VALUES ('1', 'Letter', '5');
    INSERT INTO `typeofwwork` VALUES ('2', 'Article', '1');
    INSERT INTO `typeofwwork` VALUES ('3', 'Conference', '4');

Now you see I have a weight of the typeofworks. I would like to count the numbers of each type of works for every paper, but when I have counted a lower weighted typeofwork for that paper, I should not count the higher weighted typeofwork.

I tried this but that is not correct.

    SELECT Paperti,a.tow,min(b.weight),count(*) from papers a JOIN typeofwork b on a.Tow=b.tow  GROUP BY a.Paperti;

The desired result should be:

PaperTI  Count  ToW
 ti1      1    Article
 ti2      1    Conference
 ti3      1    Letter
 ti4      1    Article       ... because Article-weight is lower than Letter-weight 

and this should be the second result:

          2    Article
          1    Conference
          1    Letter

How could I do that?


Solution

  • We can join the two tables and prioritize the type of work of each paper with row_number. The rest is just aggregation:

    select tow, count(*) cnt
    from (
        select p.tow,
            row_number() over(partition by p.paperti order by t.weight, t.rec) rn
        from papers p
        inner join typeofwwork t on t.tow = p.tow
    ) p
    where rn = 1
    group by tow
    

    Note the additional sort criteria on t.rec in the windowing clause: it is there to break potential weight ties in a deterministic way.