Search code examples
sqlpostgresqlanalytics

Divide a number into multiple bands


I've been dealing with a problem of splitting a metric into several bands. To give you some context, let's take this example where we have certain number of orders per customer. Now, a customer may order n number of products. Let's give the customer certain discount based on the number of orders. The discounts are offered based a tiered model. I'm leaving out multiple product categories to keep it simple. Here are some examples of the tables.

Orders table

Customer    |   order_no
----------------------------
Customer1   |   400
Customer2   |   1200
Customer3   |   40
Customer4   |   2000
Customer5   |   700

Tiered pricing table

Tier   | lower_th | higer_th | price |
--------------------------------------
Tier1  | 0        | 250      | 50    |
TIer2  | 251      | 500      | 45    |
Tier3  | 501      | 1000     | 40    |
TIer4  | 1001     | 10000    |  30   |

Example1: I want to be able to charge Customer1 $50 for 250 order and $45 for the rest of 150 products out of a total of 400.

Example2: I want to be able to charge Customer5 $50 for 250 order and $45 for another 250 and $40 for the rest 200 products out of a total of 700.

How do I achieve this in PostgreSQL? My output needs to be the following for Customer1. What's the best way to split the total number of orders and join it to the pricing tiers to get the corresponding amount?

Customer  | order_no | charges |
--------------------------------
Customer1 | 250      | 50      |
Customer1 | 150      | 45      |

Solution

  • You can think of your tiers as intervals.

    Two intervals [a1, b1] and [a2, b2] intersect when

    a1 <= b2 AND b1 >= a2
    

    The number of orders is another interval that always starts at 1.

    Your two intervals are: Tiers [lower_th, higer_th] and Orders [1, order_no].

    The query is a simple join using this intersection expression:

    SELECT *
        ,CASE WHEN O.order_no > T.higer_th
        THEN T.higer_th - T.lower_th + 1 -- full tier
        ELSE O.order_no - T.lower_th + 1
        END AS SplitOrderNumbers
    FROM
        Orders AS O
        INNER JOIN Tiers AS T
            -- ON 1 <= T.higer_th AND O.order_no >= T.lower_th
            ON O.order_no >= T.lower_th
    ORDER BY
        O.Customer
        ,T.lower_th
    ;
    

    You don't really need the 1 <= T.higer_th part, because it is always true, so the expression becomes simple O.order_no >= T.lower_th.

    Also, usually it is better to store intervals as [closed; open). It usually simplifies arithmetic, similar to why most programming languages have array indexes starting at 0, not 1. Your intervals seem to be [closed; closed]. In this case you need to set lower_th to 1, not 0 and have +1 in the calculations.

    With this adjustment of the sample data this query produces the following result:

    +-----------+----------+-------+----------+----------+-------+-------------------+
    | Customer  | order_no | Tier  | lower_th | higer_th | price | SplitOrderNumbers |
    +-----------+----------+-------+----------+----------+-------+-------------------+
    | Customer1 |      400 | Tier1 |        1 |      250 | 50.00 |               250 |
    | Customer1 |      400 | Tier2 |      251 |      500 | 45.00 |               150 |
    | Customer2 |     1200 | Tier1 |        1 |      250 | 50.00 |               250 |
    | Customer2 |     1200 | Tier2 |      251 |      500 | 45.00 |               250 |
    | Customer2 |     1200 | Tier3 |      501 |     1000 | 40.00 |               500 |
    | Customer2 |     1200 | Tier4 |     1001 |    10000 | 30.00 |               200 |
    | Customer3 |       40 | Tier1 |        1 |      250 | 50.00 |                40 |
    | Customer4 |     2000 | Tier1 |        1 |      250 | 50.00 |               250 |
    | Customer4 |     2000 | Tier2 |      251 |      500 | 45.00 |               250 |
    | Customer4 |     2000 | Tier3 |      501 |     1000 | 40.00 |               500 |
    | Customer4 |     2000 | Tier4 |     1001 |    10000 | 30.00 |              1000 |
    | Customer5 |      700 | Tier1 |        1 |      250 | 50.00 |               250 |
    | Customer5 |      700 | Tier2 |      251 |      500 | 45.00 |               250 |
    | Customer5 |      700 | Tier3 |      501 |     1000 | 40.00 |               200 |
    +-----------+----------+-------+----------+----------+-------+-------------------+