I've been dealing with a problem of splitting a metric into several bands. To give you some context, let's take this example where we have certain number of orders per customer. Now, a customer may order n number of products. Let's give the customer certain discount based on the number of orders. The discounts are offered based a tiered model. I'm leaving out multiple product categories to keep it simple. Here are some examples of the tables.
Orders table
Customer | order_no
----------------------------
Customer1 | 400
Customer2 | 1200
Customer3 | 40
Customer4 | 2000
Customer5 | 700
Tiered pricing table
Tier | lower_th | higer_th | price |
--------------------------------------
Tier1 | 0 | 250 | 50 |
TIer2 | 251 | 500 | 45 |
Tier3 | 501 | 1000 | 40 |
TIer4 | 1001 | 10000 | 30 |
Example1: I want to be able to charge Customer1 $50 for 250 order and $45 for the rest of 150 products out of a total of 400.
Example2: I want to be able to charge Customer5 $50 for 250 order and $45 for another 250 and $40 for the rest 200 products out of a total of 700.
How do I achieve this in PostgreSQL? My output needs to be the following for Customer1. What's the best way to split the total number of orders and join it to the pricing tiers to get the corresponding amount?
Customer | order_no | charges |
--------------------------------
Customer1 | 250 | 50 |
Customer1 | 150 | 45 |
You can think of your tiers as intervals.
Two intervals [a1, b1]
and [a2, b2]
intersect when
a1 <= b2 AND b1 >= a2
The number of orders is another interval that always starts at 1.
Your two intervals are: Tiers [lower_th, higer_th]
and Orders [1, order_no]
.
The query is a simple join using this intersection expression:
SELECT *
,CASE WHEN O.order_no > T.higer_th
THEN T.higer_th - T.lower_th + 1 -- full tier
ELSE O.order_no - T.lower_th + 1
END AS SplitOrderNumbers
FROM
Orders AS O
INNER JOIN Tiers AS T
-- ON 1 <= T.higer_th AND O.order_no >= T.lower_th
ON O.order_no >= T.lower_th
ORDER BY
O.Customer
,T.lower_th
;
You don't really need the 1 <= T.higer_th
part, because it is always true, so the expression becomes simple O.order_no >= T.lower_th
.
Also, usually it is better to store intervals as [closed; open)
. It usually simplifies arithmetic, similar to why most programming languages have array indexes starting at 0, not 1. Your intervals seem to be [closed; closed]
. In this case you need to set lower_th
to 1
, not 0
and have +1
in the calculations.
With this adjustment of the sample data this query produces the following result:
+-----------+----------+-------+----------+----------+-------+-------------------+
| Customer | order_no | Tier | lower_th | higer_th | price | SplitOrderNumbers |
+-----------+----------+-------+----------+----------+-------+-------------------+
| Customer1 | 400 | Tier1 | 1 | 250 | 50.00 | 250 |
| Customer1 | 400 | Tier2 | 251 | 500 | 45.00 | 150 |
| Customer2 | 1200 | Tier1 | 1 | 250 | 50.00 | 250 |
| Customer2 | 1200 | Tier2 | 251 | 500 | 45.00 | 250 |
| Customer2 | 1200 | Tier3 | 501 | 1000 | 40.00 | 500 |
| Customer2 | 1200 | Tier4 | 1001 | 10000 | 30.00 | 200 |
| Customer3 | 40 | Tier1 | 1 | 250 | 50.00 | 40 |
| Customer4 | 2000 | Tier1 | 1 | 250 | 50.00 | 250 |
| Customer4 | 2000 | Tier2 | 251 | 500 | 45.00 | 250 |
| Customer4 | 2000 | Tier3 | 501 | 1000 | 40.00 | 500 |
| Customer4 | 2000 | Tier4 | 1001 | 10000 | 30.00 | 1000 |
| Customer5 | 700 | Tier1 | 1 | 250 | 50.00 | 250 |
| Customer5 | 700 | Tier2 | 251 | 500 | 45.00 | 250 |
| Customer5 | 700 | Tier3 | 501 | 1000 | 40.00 | 200 |
+-----------+----------+-------+----------+----------+-------+-------------------+