I have a table like the below. For each date there is a have a product. And I want to aggregate by the product and carry forward the total. But the total is limited to 0. For each day and product, I want to aggregate and carry forward the aggregate.
Date | product | Amount |
---|---|---|
06/Jan/2021 | a | 300 |
05/Jan/2021 | a | 200 |
05/Jan/2021 | b | 500 |
04/Jan/2021 | a | -100 |
04/Jan/2021 | b | 1100 |
03/Jan/2021 | a | -500 |
03/Jan/2021 | b | -900 |
02/Jan/2021 | a | 200 |
02/Jan/2021 | b | 500 |
01/Jan/2021 | a | 100 |
For the latest date I want to see the output as below:
Date | product | total |
---|---|---|
06/Jan/2021 | a | 500 |
06/Jan/2021 | b | 1600 |
Check out the script below. Ive annotated the code to explain each step.
If you reload the script the result table will be:
The basic idea is:
< 0
then return 0, else continue with the aggregationAnnotated load script:
RawData:
Load * inline [
Date , product, Amount
06/Jan/2021, a , 300
05/Jan/2021, a , 200
05/Jan/2021, b , 500
04/Jan/2021, a , -100
04/Jan/2021, b , 1100
03/Jan/2021, a , -500
03/Jan/2021, b , -900
02/Jan/2021, a , 200
02/Jan/2021, b , 500
01/Jan/2021, a , 100
];
NoConcatenate
// Order the data by product and date in ascending order
OrderedData:
Load
date(date#(Date, 'DD/MMM/YYYY')) as Date, // convert to date
product,
Amount
Resident
RawData
Order By
product,
Date asc
;
// we dont need RawData table anymore
Drop Table RawData;
// CalculatedAmount calculation/logic:
// * if the current value for product is not equal to the prvious value
// get the current amount. This the case where the producs are "switched"
// * if the sum of the current Amount value + the previous CalculatedAmount value
// is less than 0 then return 0
// * for the rest of the cases - return the aggregated amount - current Amount
// added to the previous CalculatedAmount
CalculatedAmount:
Load
Date,
product,
Amount,
if( product <> peek('product'), Amount,
if( Amount + peek('CalculatedAmount') < 0, 0,
Amount + peek('CalculatedAmount')
)) as CalculatedAmount
Resident
OrderedData
;
// we dont need OrderedData anymore
Drop Table OrderedData;
// use the CalculatedAmount table to find the max date for each product
// these records are containing the total amount for each product
// flag these records with 1 and join the result table back to
// CalculatedAmount table
join (CalculatedAmount)
Load
max(Date) as Date,
product,
1 as isTotalAmount
Resident
CalculatedAmount
Group By
product
;
// the final table - filter CalculatedAmount table to return
// only the records for which isTotalAmount = 1
TotalAmount:
Load
Date,
product,
CalculatedAmount as TotalAmount
Resident
CalculatedAmount
Where
isTotalAmount = 1
;
// we dont need CalculatedAmount table anymore
Drop Table CalculatedAmount;