Search code examples
sqlselectsubquerygreatest-n-per-group

SQL Return rows based on value of other rows


We use a database to capture transactions / deals and I need to find out which one is the most up-to-date deal.

  • When a new transaction is put in, the database column Action is set to "Insert" and the column Modified is set to "No".
  • If an update happens, 2 additional rows are written Action = "Delete"; Modified = "Yes" AND Action = "Insert"; Modified = "Yes".
  • If a transaction gets deleted another row is added: Action = "Delete"; Modified = "No"

Here are some examples:

  • Deal A was only inserted. Therefore, I need to return Record 1.
  • Deal B was inserted and updated. I would need the query to return Record 4.
  • Deal C was inserted and deleted. The query should not return deal C at all.
  • Deal D was inserted and updated twice. I need Record 11 to be returned.
  • Deal E was inserted, updated twice and then deleted. The query should not return deal E at all.

Sample Deals

Thanks, Daniel


Solution

  • If I followed you correctly, you can filter on the latest record per deal with a subquery, and then ignore deals whose last status is a deletion:

    select t.*
    from mytable t
    where 
        t.record = (select max(t1.record) from mytable t1 where t1.dealID = t.dealID)
        and t.action <> 'Delete'
    

    As far as concerns, you don't need to use column modified to get the result that you want.

    For performance with this query, consider an index on (dealID, record, action).

    You can also express this with row_number(), if your database supports window functions:

    select record, action, modified, dealID
    from (
        select t.*, row_number() over(partition by dealID order by record desc) rn
        from mytable t
    ) t
    where action <> 'Deleted'