I'm trying to find the 95th percentile (and the highest buy) of the item-price using the order in my ~300k row table.
I've been successful in finding the 95th percentile and the highest buy for one single item with this code:
SELECT type_id,
Max(price) AS buy,
Min(price) AS '95th% buy'
FROM (SELECT *,
( Row_number()
OVER (
partition BY type_id
ORDER BY price DESC) ) AS rownr
FROM orderbuffertest AS rownr
WHERE is_buy_order = 1
ORDER BY ( Row_number()
OVER (
partition BY type_id
ORDER BY price DESC) ) ASC) AS t1
WHERE t1.type_id = 44992
AND t1.rownr < (SELECT Count(*)
FROM orderbuffertest
WHERE is_buy_order = 1
AND type_id = 44992) * 0.05;
However, now I'm trying to GROUP BY type_id
and it's messing up all my values.
Does anybody have an idea of how to GROUP BY type_id
this query? Maybe even ways to improve the original one?
I thank you in advance,
TheJozzle
Ps. Here's a link to my database, if you'd like to mess/test around with it: https://gofile.io/?c=Ga6ODr
This query should give you the results you want. It allocates a ROW_NUMBER
by price
as well as counting all rows for each type_id
and order type (is_buy_order
) in a CTE, then selects the MAX
price as the buy
price (for is_buy_order = 1
), and the minumum price for rows >= the 95th percentile as the 95th percentile price. In the event that there are no rows in the 95th percentile other than the highest price, the second highest price is returned. Similar logic applies to the generation of the sell
and 95th%sell
prices:
WITH prices AS (
SELECT type_id, price, is_buy_order,
ROW_NUMBER() OVER (PARTITION BY type_id, is_buy_order ORDER BY price DESC) AS rownr,
COUNT(*) OVER (PARTITION BY type_id, is_buy_order) AS num_rows
FROM orderbuffertest
)
SELECT type_id,
MAX(CASE WHEN is_buy_order = 1 THEN price END) AS buy,
COALESCE(MIN(CASE WHEN is_buy_order = 1 AND 100.0 * (rownr - 1) / num_rows <= 5 AND rownr != 1 THEN price END),
MAX(CASE WHEN is_buy_order = 1 AND rownr = 2 THEN price END)) AS `95th%buy`,
MIN(CASE WHEN is_buy_order = 0 THEN price END) AS sell,
COALESCE(MAX(CASE WHEN is_buy_order = 0 AND 100.0 * rownr / num_rows >= 95 AND rownr != num_rows THEN price END),
MAX(CASE WHEN is_buy_order = 0 AND rownr = num_rows - 1 THEN price END)) AS `95th%sell`
FROM prices
GROUP BY type_id
If you can't use CTEs for some reason, you could write the CTE as a subquery:
SELECT type_id,
MAX(CASE WHEN is_buy_order = 1 THEN price END) AS buy,
COALESCE(MIN(CASE WHEN is_buy_order = 1 AND 100.0 * (rownr - 1) / num_rows <= 5 AND rownr != 1 THEN price END),
MAX(CASE WHEN is_buy_order = 1 AND rownr = 2 THEN price END)) AS `95th%buy`,
MIN(CASE WHEN is_buy_order = 0 THEN price END) AS sell,
COALESCE(MAX(CASE WHEN is_buy_order = 0 AND 100.0 * rownr / num_rows >= 95 AND rownr != num_rows THEN price END),
MAX(CASE WHEN is_buy_order = 0 AND rownr = num_rows - 1 THEN price END)) AS `95th%sell`
FROM (
SELECT type_id, price, is_buy_order,
ROW_NUMBER() OVER (PARTITION BY type_id, is_buy_order ORDER BY price DESC) AS rownr,
COUNT(*) OVER (PARTITION BY type_id, is_buy_order) AS num_rows
FROM orderbuffertest
) prices
GROUP BY type_id