Apply cumsum function to a variable with several conditions

My data look similar to this:

data <- data.frame(date = seq.Date(as.Date("2021-03-21"), as.Date("2021-04-21"), "day"),
                   rad = sample(1:10, 32, replace = TRUE))

> head(data)
        date rad
1 2021-03-21   1
2 2021-03-22   5
3 2021-03-23   1
4 2021-03-24   9
5 2021-03-25  10
6 2021-03-26   4

I am learning to reshape and manipulate large datasets and have run into a case where my R knowledge and googling skills no longer help.

I would like to learn two things:

  1. How to assign values to a variable for a certain period. Let's say for example, I want to give a value of 42 to all elements from the rad column for the time period between 2021-04-01 and 2021-04-05.

  2. More importantly (and unrelated to 1.), I would like to write code that:

  • creates a new column based on the "rad" variable
  • calculates the cumsum of "rad" for a certain time period (e.g. 2021-04-01 - 2021-04-05)
  • takes the last value of that cumsum (the total for the summed period) and assigns it to every row of a later time period (e.g. 2021-04-06 - 2021-04-15)
  • keeps the original "rad" values for all remaining dates, to which no function is applied

I am not sure how best to show the desired output.


  • You could use a logical (boolean) vector to select the rows you want to modify:

    data <- data.frame(date = seq.Date(as.Date("2021-03-21"), as.Date("2021-04-21"), "day"),
                       rad = sample(1:10, 32, replace = TRUE))
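    # Logical vectors selecting the rows to modify (both windows are inclusive)
    modified <- data$date >= as.Date("2021-04-01") & data$date <= as.Date("2021-04-05")
    modified.after <- data$date >= as.Date("2021-04-06") & data$date <= as.Date("2021-04-15")
    # First question: uncomment to set rad to 42 in the first window
    # data$rad[modified] <- 42
    # Second question: start from a copy of rad, then overwrite the two windows
    data$radnew <- data$rad
    cs <- cumsum(data$rad[modified])             # running total over the first window
    data$radnew[modified] <- cs
    data$radnew[modified.after] <- tail(cs, 1)   # last cumsum value = window total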
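
Because the sample data are random, it can be hard to see whether the result is right. Below is a minimal sketch that wraps the same idea in a function and runs it on a fixed `rad` column of ones, so the expected values are easy to check by eye. The helper name `cumsum_then_fill`, its arguments, and the `toy` data frame are made up for illustration; they are not part of base R or any package, and both date windows are assumed to be inclusive.

```r
# Hypothetical helper (not from any package): take the cumulative sum of
# `rad` over one date window, then carry the final total into a second window.
cumsum_then_fill <- function(df, sum_from, sum_to, fill_from, fill_to) {
  in_sum  <- df$date >= as.Date(sum_from)  & df$date <= as.Date(sum_to)
  in_fill <- df$date >= as.Date(fill_from) & df$date <= as.Date(fill_to)

  df$radnew <- df$rad              # default: copy of the original values
  cs <- cumsum(df$rad[in_sum])     # running total inside the first window
  df$radnew[in_sum] <- cs
  if (length(cs) > 0) {
    # the last cumsum value is the total of the first window
    df$radnew[in_fill] <- cs[length(cs)]
  }
  df
}

# Deterministic toy data (an assumption for this example): rad is 1 every day.
toy <- data.frame(
  date = seq.Date(as.Date("2021-03-21"), as.Date("2021-04-21"), by = "day"),
  rad  = rep(1, 32)
)

result <- cumsum_then_fill(toy,
                           sum_from  = "2021-04-01", sum_to  = "2021-04-05",
                           fill_from = "2021-04-06", fill_to = "2021-04-15")

# Show the rows around the two windows
print(result[result$date >= as.Date("2021-03-31") &
             result$date <= as.Date("2021-04-16"), ])
```

With this input, `radnew` is 1, 2, 3, 4, 5 for 2021-04-01 through 2021-04-05, 5 for 2021-04-06 through 2021-04-15, and 1 (the original `rad`) everywhere else.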