Search code examples
sqlsql-serverlagrow-number

Expand on limit SQL query results use lag() window function


I have an table with columns id, score, parent_id ordered by the score. I have ask one scenario in here.

The different from previous question is that the parent_id might show up on multiple rows not necessary sequence rows. The updated table:

id score parent_id
5859 10 5859
2157043 9 5859
21064154 8 21064154
51992 7 51992
34384599 6 51992
1675761 5 5859
3465729 4 3465729
401202 3 401203
1817458 2 1817458

I want to query all columns from this table with the same order but limit results at least 5 rows to meet the unique parent_id number equal to 5. As result, the parent_id only contains 5 ids: 5859, 21064154, 51992, 3465729, 401203

Expected Results like:

id score parent_id
5859 10 5859
2157043 9 5859
21064154 8 21064154
51992 7 51992
34384599 6 51992
1675761 5 5859
3465729 4 3465729
401202 3 401203

The solution using lag only works for the following row has same value of parent_id. If use Java, we could use a SET to store the parent_id and keep count the unique parent_id, but how do we write in SQL?

select id, score, parent_id
from (
  select *, Sum(diff) over(order by score desc)seq
  from (
     select *, 
       case when Lag(parent_id) over(order by score desc) = parent_id then 0 else 1 end diff
    from t
  )t
)d
where seq <= 5
order by score desc;

Solution

  • Perhaps this will work for you, a refactor of your existing query but an alternative to lag using exists which caters for non-sequential rows, give this a try:

    select id, score, parent_id
    from (
      select *, Sum(keep) over (order by score desc) seq
      from (
        select *, 
          case when exists (
            select * from t t2 
            where t2.parent_id = t.parent_id and t2.score > t.score
          ) then 0 else 1 end keep
          from t
      )t
    )s
    where seq <= 5
    order by score desc;