I work with a data frame that that looks like that:
DATUM CP SMER TRH MNOZSTVI CENA POPLATKY OBJEM UCET KVARTAL ROK AKTUALNI.MNOZSTVI
<dttm> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <dbl> <dbl>
1 2020-03-03 00:00:00 CEZ K BCPP 50 465. 91.3 -23240 CZK Q1 2020 NA
2 2020-03-04 00:00:00 CEZ K BCPP 50 467. 58.9 -13980 CZK Q1 2020 NA
3 2020-03-12 00:00:00 CEZ P BCPP 30 398 51.8 11940 CZK Q1 2020 NA
4 2020-03-25 00:00:00 KOMERCNI BANKA K BCPP 40 542 85.9 -21680 CZK Q1 2020 NA
5 2020-03-25 00:00:00 MONETA MONEY BANK K BCPP 300 58.4 71.3 -17505 CZK Q1 2020 NA
6 2020-03-30 00:00:00 CEZ K BCPP 10 391 50 -3910 CZK Q1 2020 NA
7 2020-04-02 00:00:00 USD K NA 1000 25.8 0 -25778 CZK Q2 2020 NA
8 2020-04-03 00:00:00 USD K NA 3000 26.1 0 -78392 CZK Q2 2020 NA
9 2020-04-04 00:00:00 USD K NA 1000 26.4 0 -26363. CZK Q2 2020 NA
10 2020-04-06 00:00:00 AVAST K BCPP 150 125. 75.8 -18810 CZK Q2 2020 NA
And I would like to fill cumulative sum of variable MNOZSTVI into variable AKTUALNI.MNOZSTVI grouped by CP. So vector AKTUALNI.MNOZSTVI should be c(50,100,130,40,300,140,1000,4000,5000,150, etc.).
The problem is that some values of MNOZSTVI are missing and so I dont know how to use function cumsun() that can't cope with missing values + I struggle to perform it for grouped data.
Does anybody know how to do that either woth the help of cumsum() or some other function? Thank you.
We can group by 'CP' and get the cumsum
of 'MNOZSTVI' in mutate
library(dplyr)
df1 <- df1 %>%
group_by(CP) %>%
mutate(AKTUALNI.MNOZSTVI = cumsum(MNOZSTVI))
Or use base R
with ave
df1$AKTUALNI.MNOZSTVI <- with(df1, ave(MNOZSTVI, CP, FUN = cumsum))