Search code examples
sqloracle-databasegreatest-n-per-group

How to show the most recent value for each row


This is my current query. I used an inline view to only show the top 5 results for the companies with the highest volume of shares traded at a specific stock exchange. Alongside this, I want to show the most recent price for each row. This information is found in the stock_price table with a respective date.

SELECT 
*
FROM
    (
        SELECT
            c.name AS company_name,
            t.stock_id AS stock_id,
            SUM(t.shares) AS Trade_Volume
        FROM company c
            JOIN trade t
            ON c.stock_id = t.stock_id
            JOIN stock_price sp
            ON sp.stock_ex_id =3 
            AND sp.stock_id = t.stock_id
            WHERE t.stock_ex_id = 3
        GROUP BY
            c.name, t.stock_id
            ORDER BY SUM(t.shares) DESC
    )
WHERE
    ROWNUM <= 5;

enter image description here

My attempt:

SELECT 
c.name,
SUM(t.shares),
sp.price
FROM 
company c 
JOIN trade t 
ON c.stock_id = t.stock_id
JOIN stock_price sp
ON sp.stock_id = t.stock_id
AND sp.stock_ex_id = t.stock_ex_id
WHERE sp.stock_ex_id =3
GROUP BY c.name, sp.price
;

The output:

enter image description here


Solution

  • You could do something like this (not tested; explained after the query):

    with
      high_vols as (
        select *
        from   (
                 select c.name as company_name, c.stock_id as stock_id,
                        sum(t.shares) as trade_volume
                 from   company c join trade t on c.stock_id = t.stock_id
                 where  t.stock_ex_id = 3
                 group  by c_name, c.stock_id
                 order  by sum(t.shares) desc
               )
        where  rownum <= 5
      )
    , last_prices as (
        select stock_id, 
               min(price) keep (dense_rank last order by price_date) as last_price
        from   stock_price
        where  stock_id in (select stock_id from high_vols) and stock_ex_id = 3
        group  by stock_id
      )
    select hv.company_name, hv.stock_id, hv.total_volume, lp.last_price
    from   high_vols hv join last_prices lp on hv.stock_id = lp.stock_id
    order  by total_volume desc
    ;
    

    HIGH_VOLS is the query you have already. It finds the five stocks with the highest volume on a given exchange - and on a given date if you will add that in the query; you didn't do that in your query, and you didn't describe your data in enough detail to attempt to add that in the query, but you should be able to do that yourself. Note that the same observation (about date) applies to the second query in the WITH clause.

    LAST_PRICES selects prices from the price table, filtered for the exchange, for the five stocks that were found in the first subquery, and for the desired date (after you add that in). It uses the aggregate LAST function - a very useful aggregate function for this kind of query, unfortunately ignored by many developers.

    The last step is to join the two tables. It is best to join the tables only after the two aggregations. (In any case, if statistics are up-to-date, the optimizer will find the best order of operations regardless of how you write the query, as long as it is written correctly. In fact, the optimizer will probably choose to do the "top-five" aggregation just on the "trades" table, and join to "company" only at the very last step.)