I have three columns in a table, customerId, date and storedvalue (amount in the wallet of a customer) I need to count the days between a person loads cash in to his account and exhaust it to zero. For example, if a person loads cash on 2024-01-01 and he finishes it on 2024-01-05 then the day count should be 4. If he loads it back again then the count starts again.
Cistomerid | date | storedvalue |
---|---|---|
1234567 | 2024-01-01 | 100 |
1234567 | 2024-01-02 | 55 |
1234567 | 2024-01-03 | 45 |
1234567 | 2024-01-04 | 67 |
1234567 | 2024-01-05 | 0 |
1234567 | 2024-01-06 | 300 |
1234567 | 2024-01-07 | 100 |
1234567 | 2024-01-08 | 150 |
1234567 | 2024-01-09 | 0 |
The result needs to be like
CustomerId | date diff |
---|---|
1235567 | 4 |
1234567 | 3 |
In the above example the customer added 100rs to the wallet on 2024-01-01 and spent it to zero by 2024-01-04. the date diff between the day money is added and the day money got exhausted should be calculated and it's 4 days in the first instance. If the money is again added to the wallet after its zero that should be calculated as second instance and it's day 1 again and again he spends all the money by 2024-01-09 which will be 3 days. The partition should happen on 0. He can add and spend any number of times but once the account is zero the previous count ends and new count starts.
Thanks in advance!!
I tried the following script but it returns only the data with amount zero
WITH ranked_data AS (
SELECT
customerID,
date,
storedvalue,
ROW_NUMBER() OVER (PARTITION BY customerID ORDER BY date) AS row_num,
ROW_NUMBER() OVER (PARTITION BY customerID, storedvalue ORDER BY date) AS value_change_num
FROM
your_table_name
)
SELECT
customerID,
MIN(date) AS start_date,
MAX(date) AS end_date,
DATEDIFF(day, MIN(date), MAX(date)) AS days_between
FROM
ranked_data
WHERE
storedvalue = 0
GROUP BY
customerID,
storedvalue,
row_num - value_change_num
ORDER BY
customerID,
start_date;
First is to identify the related sets of rows which is when the storedValue
turns 0
. Using the lag()
window function to get the previous row value and case
expression to check the value is 0
. Performing a cumulative sum() over ()
gives you the required grouping
select Cistomerid,
datediff(day, min(date), max(date))
from
(
select *, grp = sum (case when prevValue = 0 then 1 else 0 end)
over (partition by Cistomerid order by date)
from
(
select *,
prevValue = lag (storedvalue, 1, 0)
over (partition by Cistomerid order by date)
from ranked_data
) r
) r
group by Cistomerid, grp