Search code examples
t-sqlazure-synapseazure-synapse-analytics

Calculate the revenue by year and by month based on details from another table between a startdate and endate


In a customer table, I have information about each customer and their subscription type. The table also shows when the subscription/membership began and when the next payment is due based on the subscription/membership type. Below is the table sample -

Customer Table:

Customer_ID Charge_ID Membership_Type Membership_StartDate Membership_EndDate Membership_Status Next_Payment_Date Gross
cus_OzcGFMWjj7e6wf ch_3OBd1JJuo27Vzw6d0tzOgF7c Membership £216 Yearly 2023-11-12 NULL Active 2024-11-12 £216.00
cus_OynI9WYzFKG7ZJ ch_3OApgkJuo27Vzw6d0bOJgdYf Membership £5 Monthly 2023-11-10 NULL Active 2023-12-10 £5.00
cus_OzAOs3xIx4p3vd ch_3OBC21Juo27Vzw6d1AGyK9LQ Membership £5 Monthly 2023-11-11 NULL Active 2023-12-11 £5.00
cus_OzXKT687CWwQuS ch_3OBYFiJuo27Vzw6d1FmiNkGm Membership £5 Monthly 2023-11-12 NULL Active 2023-12-12 £5.00
cus_OzNXmqgA4I7zy2 ch_3OBOkMJuo27Vzw6d0neOzJ13 Membership £10 Monthly 2023-11-11 NULL Active 2023-12-11 £10.00

My next step is to create a new table called Revenue, which should contain the following information. Depending on the type of subscription or membership, this table shows earned and deferred revenue for each customer by year and month. How can I achieve the below table on Azure Synapse Dedicated SQL Pool?

Revenue Table:

Customer_ID Customer_Name Membership_Type Year January February March April May June July August September October November December
cus_OzcGFMWjj7e6wf ch_3OBd1JJuo27Vzw6d0tzOgF7c Membership £216 Yearly 2023 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 11.21311475 18.29508197
cus_OzcGFMWjj7e6wf ch_3OBd1JJuo27Vzw6d0tzOgF7c Membership £216 Yearly 2024 18.29508197 17.1147541 18.29508197 17.70491803 18.29508197 17.70491803 18.29508197 18.29508197 17.70491803 18.29508197 6.491803279 NULL
cus_OynI9WYzFKG7ZJ ch_3OApgkJuo27Vzw6d0bOJgdYf Membership £5 Monthly 2023 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 3.5 1.5
cus_OzAOs3xIx4p3vd ch_3OBC21Juo27Vzw6d1AGyK9LQ Membership £5 Monthly 2023 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 3.333333333 1.666666667
cus_OzXKT687CWwQuS ch_3OBYFiJuo27Vzw6d1FmiNkGm Membership £5 Monthly 2023 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 3.166666667 1.833333333
cus_OzNXmqgA4I7zy2 ch_3OBOkMJuo27Vzw6d0neOzJ13 Membership £10 Monthly 2023 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 6.666666667 3.333333333
cus_P087tismWgTMyM ch_3OC7qDJuo27Vzw6d0hOuCQT2 Membership £54 Yearly 2023 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 2.454545455 4.227272727
cus_P087tismWgTMyM ch_3OC7qDJuo27Vzw6d0hOuCQT2 Membership £54 Yearly 2024 4.227272727 3.954545455 4.227272727 4.090909091 4.227272727 4.090909091 4.227272727 4.227272727 4.090909091 4.227272727 4.090909091 1.636363636
  1. My first step was to create a script to calculate the revenue amount for an individual customer. To explain my script below I took details of customer_id - cus_OzcGFMWjj7e6wf as an input to this script. If you can help me figure out how to use the below script for each customer in my customer table to create my revenue table, that would be greatly appreciated.
declare @Membership_StartDate date;
declare @Membership_EndDate date;
declare @Gross INT;

set @Membership_StartDate = '2023-11-12';
set @Membership_EndDate = '2024-11-12';
set @Gross = 216.000000;

With FirstQuery AS
(
select year(dt) [Year], month(dt) [Month], count(*) NumberofDays
from (select top(datediff(d, dateadd(dd,-1,@Membership_StartDate), dateadd(dd,-1,@Membership_EndDate))) dateadd(d, row_number() over (order by (select null)), dateadd(dd,-1,@Membership_StartDate)) dt
from sys.columns) q
group by year(dt), month(dt)
)   
select Year
    , Jan = MAX(case when Month = 1 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)
    , Feb = MAX(case when Month = 2 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)
    , Mar = MAX(case when Month = 3 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)
    , Apr = MAX(case when Month = 4 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)
    , May = MAX(case when Month = 5 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)
    , Jun = MAX(case when Month = 6 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)
    , Jul = MAX(case when Month = 7 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)
    , Aug = MAX(case when Month = 8 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)
    , Sep = MAX(case when Month = 9 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)
    , Oct = MAX(case when Month = 10 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)
    , Nov = MAX(case when Month = 11 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)
    , Dec = MAX(case when Month = 12 then (1.0*NumberofDays)/(1.0*DATEDIFF(day, @Membership_StartDate, @Membership_EndDate))*(1.0*@Gross) end)    
from FirstQuery
group by Year

Solution

  • If we reimagine your tables to be relational, this is a fairly straightforward pivot.

    Consider:

    DECLARE @Customers TABLE (CustomerID INT IDENTITY, FirstName NVARCHAR(50), LastName NVARCHAR(50));
    INSERT INTO @Customers (FirstName, LastName) VALUES 
    ('James','Kirk'), ('Leonard','McCoy'), ('Montgomery','Scott'), ('Nyota','Uhura');
    
    DECLARE @CustomerMemberships TABLE (CustomerID INT, MembershipType INT, Active BIT, BillCycle TINYINT, StartDate DATE, EndDate DATE);
    INSERT INTO @CustomerMemberships (CustomerID, MembershipType, Active, BillCycle, StartDate, EndDate) VALUES
    (1,1,1, 10, '2024-01-01', '2024-12-31'), (2,1,1, 11, '2024-01-01', '2025-06-30'), (3,2,1, 12, '2025-01-01', '2026-03-31'), (4,3,1, 5, '2024-01-01', '2030-12-31');
    
    DECLARE @Memberships TABLE (MembershipType INT IDENTITY, MembershipName NVARCHAR(50), AnnualBilling BIT, MonthlyBilling BIT, Cost DECIMAL(10,3));
    INSERT INTO @Memberships (MembershipName, AnnualBilling, MonthlyBilling, Cost) VALUES
    ('Monthly Basic', 0, 1, 5.00), ('Monthly Premium', 0, 1, 10.00), ('Annual Basic', 1, 0, 216);
    
    ;WITH Dates AS (
    SELECT DATEADD(MONTH,1,GETDATE()) AS Date
    UNION ALL
    SELECT DATEADD(MONTH,1,Date)
      FROM Dates
     WHERE Date < '2030-12-01'
    ), Revenue AS (
    SELECT DATEFROMPARTS(DATEPART(YEAR,Date), DATEPART(MONTH,Date), 1) AS Month,
           CASE WHEN AnnualBilling = 1 THEN m.Cost / 12.0
                WHEN MonthlyBilling = 1 THEN m.Cost
            END AS Cost, c.CustomerID, C.FirstName, C.LastName, m.MembershipName
      FROM @Customers c
        CROSS APPLY Dates y
        INNER JOIN @CustomerMemberships cm
          ON c.CustomerID = cm.CustomerID
          AND y.Date BETWEEN cm.StartDate AND CM.EndDate
        INNER JOIN @Memberships m
          ON cm.MembershipType = m.MembershipType
    )
    SELECT *
      FROM (
    
            SELECT FirstName, LastName, DATEPART(YEAR,Month) AS Year, DATENAME(MONTH,Month) AS Month, Cost
              FROM Revenue
             WHERE Month IS NOT NULL
           ) a
        PIVOT (
               SUM(Cost) FOR Month IN (January, February, March, April, May, June, July, August, September, October, November, December)
              ) p;
    
    FirstName LastName Year January February March April May June July August September October November December
    James Kirk 2024 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000
    Leonard McCoy 2024 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000
    Leonard McCoy 2025 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000 5.0000000
    Montgomery Scott 2025 10.0000000 10.0000000 10.0000000 10.0000000 10.0000000 10.0000000 10.0000000 10.0000000 10.0000000 10.0000000 10.0000000 10.0000000
    Montgomery Scott 2026 10.0000000 10.0000000 10.0000000
    Nyota Uhura 2024 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000
    Nyota Uhura 2025 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000
    Nyota Uhura 2026 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000
    Nyota Uhura 2027 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000
    Nyota Uhura 2028 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000
    Nyota Uhura 2029 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000
    Nyota Uhura 2030 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000 18.0000000

    The table objects described at the top of this script are relational and can describe the relationship between a customer, their membership and membership types. You can consider each of the columns as distinct properties of those objects.

    In the COMMON TABLE EXPRESSION (CTE) 'Dates' we're producing a row for each month between now and the end of the period (2023-12-01). In the following CTE Revenue we prepare the data to be consumed by a pivot, producing a row for each month a customer has a membership, and determining the amount of attributable revenue they have - for monthly plans we just take the cost. For annual plans we take the cost divided by 12.

    Finally, we perform the pivot on the revenue CTE to produce a data set which contains a column for each month that we have data for, for each customer and each year they have an active subscription.

    I would strongly suggest you don't use strings for your identifiers, or if you must, also include a numeric surrogate key.