I want to calculate the cumulative sum of the column value, with a special requirement: after a quarter (= 3 months) has passed, the value is depreciated by a constant, say 0.9. That means it then factors into the cumsum with only 90% of its original value. When another quarter has passed, this is repeated, i.e. the 90% are multiplied by the constant (0.9) again (this equals the original value * 0.9^2). This procedure is repeated for every quarter that passes, so after a year 0.9^4 of the original value runs into the cumsum, after two years 0.9^8, and so on. The full original value only factors into the cumsum during the first quarter following its date.
Example: a value of 100 dated 01/15 would add to the cumsum with 100 until 04/15, with 100*0.9 = 90 between 04/15 and 07/15, with 100*0.9^2 = 81 between 07/15 and 10/15, and so on; see the desired_output column of the MRE.
MRE:
library(tibble)
df <- tribble(~date, ~value, ~country, ~desired_output,
"2017-01-01", 2, "US", 2,
"2017-01-05", 2, "UK", 8,
"2017-01-05", 4, "US", 8,
"2017-04-01", 5, "IT", 12.8, # only the first observation is older than a quarter so calculate as (2*0.9+2+4+5)
"2017-04-03", 3, "US", 115.8, #(2*0.9+2+4+5+3+100)
"2017-04-03", 100, "US", 115.8,
"2017-04-11", 20, "UK", 135.2, # now the first three observations are older than a quarter (8*0.9+5+3+100+20)
"2017-04-15", 6, "US", 141.2,
"2017-07-02", 30, "US", 170.52, # now the first observation is older than two quarters and the second, third, and forth observation are older than a quarter (2*0.9^2+11*0.9+3+100+20+6+30)
"2017-10-12", 4, "UK", 151.912, # ((2+2+4)*0.9^3+(5+3+100+20)*0.9^2+(6+30)*0.9+4+6)
"2017-10-12", 6, "IT", 151.912) # (8*0.9^3+128*0.9^2+36*0.9+10)
assuming the constant has a value of 0.9 and the format of the date is %Y-%m-%d.
Pseudo-code to further clarify the question:

1. Take the cumsum of each quarter / 3-month period (counted back from the date of the current row).
2. Exponentiate the constant by the distance of this period from the current row's date (e.g. C^2 if the quarter is two quarters back from today).
3. Multiply this adjusted constant by the cumsum of the respective quarter (3-month period).
4. Take the cumsum of all the quarter cumsums.
In a next step, I would also want to take this specific cumsum by a grouping variable (e.g. country).
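Setting the grouping requirement aside, here is a minimal base-R sketch of what I am after. It approximates a quarter as 90 days and depreciates each value individually by the number of full quarters elapsed (which is equivalent to depreciating each quarter's cumsum); dep_cumsum is a made-up column name:

```r
df <- data.frame(
  date  = as.Date(c("2017-01-01", "2017-01-05", "2017-01-05", "2017-04-01",
                    "2017-04-03", "2017-04-03", "2017-04-11", "2017-04-15",
                    "2017-07-02", "2017-10-12", "2017-10-12")),
  value = c(2, 2, 4, 5, 3, 100, 20, 6, 30, 4, 6)
)
C <- 0.9  # depreciation constant

df$dep_cumsum <- sapply(seq_len(nrow(df)), function(i) {
  # all rows up to (and including) the current row's date
  idx  <- which(df$date <= df$date[i])
  # full 90-day quarters elapsed for each of those rows
  nqtr <- floor(as.numeric(df$date[i] - df$date[idx]) / 90)
  # depreciate each value by C per elapsed quarter, then sum
  sum(df$value[idx] * C^nqtr)
})
```

With 90-day quarters this reproduces the desired output except on the rows where the 90-day and 3-calendar-month definitions of a quarter diverge (e.g. 2017-07-02).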
Here is an option:
DT[, do :=
  .SD[.SD, on = .(date <= date), by = .EACHI, {
    # number of full 90-day quarters between each earlier row and the current row
    nqtr <- floor(pmax(0, i.date - x.date) / 90)
    # depreciate each value by 0.9 per elapsed quarter, then sum
    sum(value * 0.9^nqtr)
  }]$V1
]
output:
date value country desired_output do
1: 2017-01-01 2 US 2.000 2.000
2: 2017-01-05 2 UK 8.000 8.000
3: 2017-01-05 4 US 8.000 8.000
4: 2017-04-01 5 IT 12.800 12.800
5: 2017-04-03 3 US 115.800 115.800
6: 2017-04-03 100 US 115.800 115.800
7: 2017-04-11 20 UK 135.200 135.200
8: 2017-04-15 6 US 141.200 141.200
9: 2017-07-02 30 US 170.520 160.220
10: 2017-10-12 4 UK 151.912 151.372
11: 2017-10-12 6 IT 151.912 151.372
The difference is in how we define a quarter: I used 90 days. For example, on 2017-07-02, rows 2 - 6 are one quarter back when using 90 days, whereas in your OP only rows 2 - 4 are one quarter back when using 3 calendar months. If 3m is really important, see the 3-month handling below.
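To make the 90-day vs. 3-calendar-month difference concrete, a small sketch for the 2017-07-02 row:

```r
d0      <- as.Date("2017-07-02")   # date of the current row
earlier <- as.Date(c("2017-01-05", "2017-01-05", "2017-04-01",
                     "2017-04-03", "2017-04-03"))  # dates of rows 2 - 6

# 90-day definition: all five rows are exactly one quarter back
floor(as.numeric(d0 - earlier) / 90)

# 3-calendar-month definition: the quarter boundary is 2017-04-02,
# so only rows 2 - 4 are at least one quarter back
boundary <- seq(d0, by = "-3 months", length.out = 2)[2]
earlier < boundary
```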
data:
library(data.table)
DT <- fread('date,value,country,desired_output
"2017-01-01", 2, "US", 2
"2017-01-05", 2, "UK", 8
"2017-01-05", 4, "US", 8
"2017-04-01", 5, "IT", 12.8
"2017-04-03", 3, "US", 115.8
"2017-04-03", 100, "US", 115.8
"2017-04-11", 20, "UK", 135.2
"2017-04-15", 6, "US", 141.2
"2017-07-02", 30, "US", 170.52
"2017-10-12", 4, "UK", 151.912
"2017-10-12", 6, "IT", 151.912')
DT[, date := as.IDate(date, format="%Y-%m-%d")]
Handling the 3-month and country requirements:
DT[, do :=
  .SD[.SD, on = .(country, date <= date), by = .EACHI, {
    # quarter boundaries, stepping back 3 calendar months at a time from the
    # current date (the -93L ensures the sequence reaches past the earliest date)
    vec <- rev(seq(i.date, min(x.date) - 93L, by = "-1 quarter"))
    # which quarter each earlier date falls into
    itvl <- findInterval(x.date, vec, rightmost.closed = TRUE)
    # number of full quarters elapsed
    nqtr <- length(vec) - itvl - 1L
    # depreciate and sum
    sum(value * 0.9^nqtr)
  }]$V1
]
output:
date value country desired_output do
1: 2017-01-01 2 US 2.000 2.000
2: 2017-01-05 2 UK 8.000 8.000
3: 2017-01-05 4 US 8.000 8.000
4: 2017-04-01 5 IT 12.800 13.000
5: 2017-04-03 3 US 115.800 115.800
6: 2017-04-03 100 US 115.800 115.800
7: 2017-04-11 20 UK 135.200 135.200
8: 2017-04-15 6 US 141.200 141.200
9: 2017-07-02 30 US 170.520 170.520
10: 2017-10-12 4 UK 151.912 151.912
11: 2017-10-12 6 IT 151.912 151.912