I have a dataset with four columns and want to create a fifth column, which will be the cumulative sum of the fourth.
The dataset:
Col1 Col2 Col3 Col4
2017-12-26 0.000000e+00 NaN NaN
2017-12-27 3.295056e-05 NaN NaN
2017-12-28 2.998402e-03 NaN NaN
2017-12-29 -3.777862e-03 NaN NaN
2018-01-01 0.000000e+00 NaN NaN
2018-01-02 -6.265354e-04 0.001210748 -1.837284e-03
2018-01-03 1.501328e-03 0.001505680 -4.351903e-06
2018-01-04 1.103716e-03 0.001266719 -1.630024e-04
2018-01-05 7.437758e-03 0.005948840 1.488918e-03
2018-01-08 1.323032e-03 0.000949433 3.735992e-04
The dput:
structure(list(date = structure(c(17526, 17527, 17528, 17529,
17532, 17533, 17534, 17535, 17536, 17539, 17540, 17541), class = "Date"),
S.P.ASX.200 = c(0, 3.29505576881495e-05, 0.00299840195060885,
-0.00377786173026062, 0, -0.000626535423983166, 0.0015013280979328,
0.00110371639430684, 0.00743775814121861, 0.00132303219378338,
0.000880856061594715, -0.00637243717200697), portfolio.average = c(NaN,
NaN, NaN, NaN, NaN, 0.00121074842324788, 0.00150568000136751,
0.00126671875198612, 0.00594883984640553, 0.000949433018286304,
-0.00477888742856801, -0.007316033066539), marketed_adjusted_return = c(NaN,
NaN, NaN, NaN, NaN, -0.00183728384723105, -4.35190343470821e-06,
-0.000163002357679275, 0.00148891829481308, 0.000373599175497076,
0.00565974349016273, 0.000943595894532028)), row.names = 612:623, class = "data.frame")
I've tried the code below, but it returns a fifth column containing only NaNs.
cumulative_a <- a %>%
  mutate(cumulative = cumsum(a[,4]))
Is there a way to perform the cumsum whilst ignoring the NaN values?
You can replace the NaN values with 0 and then use cumsum (is.na() is TRUE for NaN as well as NA):
df$cum_a <- cumsum(replace(df$marketed_adjusted_return,
                           is.na(df$marketed_adjusted_return), 0))
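If you would rather keep the leading rows marked as missing instead of showing a running total of 0, one possible variation is sketched below in base R. It assumes a short vector x copied from the first few values of marketed_adjusted_return in the question; the names x, missing, cum_zero and cum_keep are only illustrative.

```r
# Sample values copied from the question's marketed_adjusted_return column
# (x and the other names below are illustrative, not from the original post)
x <- c(NaN, NaN, -0.00183728384723105, -4.35190343470821e-06,
       -0.000163002357679275)

# is.na() returns TRUE for NaN as well as NA, so one check covers both
missing <- is.na(x)

# Variant 1: count missing values as 0, so the running total starts at 0
cum_zero <- cumsum(replace(x, missing, 0))

# Variant 2: same running total, but leave NA where the input was missing
cum_keep <- cum_zero
cum_keep[missing] <- NA

print(cum_zero)
print(cum_keep)
```

The second variant gives the same totals on the non-missing rows; it only differs in what is shown where the input had no value, which may matter if a 0 there could be mistaken for a real return.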