Search code examples
sqlknex.js

SQL Query: Filter duplicate entries by getting the entries with the maximum id number then join another table's data with it


I am trying to get all id, std_name from table1, and all id, score from table2 where std_id of table2 matches the id of table1 and deleted_at should be null for all entries of table1 and table2. But table2 can have duplicate std_ids, in that case, I only want the entries with the maximum id number from table2.

Sample table1:

id std_name deleted_at
1 jhon null
2 sam null
3 joe null

Sample table2:

id std_id score deleted_at
1 1 10 null
2 2 20 null
3 1 30 null

So far I have tried using this query:

const query = knex.select([
    't1.id as t1id',
    't1.std_name as name',
    't2.score as score'
])
.from('table1 as t1')
.leftJoin('table2 as t2', function () {
    this.on('t2.std_id', '=', 't1.id')
})
.joinRaw('left join (select MAX(id) as id, std_id from table2 group by std_id) as kst on kst.std_id = t2.std_id');

query.where({'t1.deleted_at': null}).orderBy('t1.id')

Results generated for the above query:

id name score
1 jhon 30
2 sam 20

But this only returns the maximum id entry of the duplicate entries from table2 and omits the entries of table1, but I also want the ids from table1 which are not included in the std_id of table2.

My desired output:

id name score
1 jhon 30
2 sam 20
3 joe null

Solution

  • You can use window functions. In SQL this looks like:

    select t1.*, t2.score
    from table1 t1 left join
         (select t2.*,
                 row_number() over (partition by t2.std_id order by t2.id desc) as seqnum
          from table2 t2
         ) t2
         on t2.std_id = t1.id and t2.seqnum = 1;