Search code examples
rdataframerep

Insert rows in a dataframe based on a calculation


I am pretty new to R and I am trying to copy a calculation done in Excel to R.
I have a data frame like this:

Component <- c("A", "B", "C")
Report_Time <- c(5781, 5781, 5781)
Interval <- c(700, 600, 800)
End_Time <- c(8281, 8281, 8281)
Start_Time <- c(800, 298, 780)
df <- data.frame(Component, Report_Time, Interval, End_Time, Start_Time)

When Printed it looks like this:

# Component Report_Time Interval    End_Time    Start_Time
#1    A         5781        700         8281        800
#2    B         5781        600         8281        298
#3    C         5781        800         8281        780

For each component, I want to populate a calculated column "Interval_Time", which is the sum of Start Time + Report_Time for First, then if it is less than End_Time the insert a row with the sum of Interval_Time (Last sum) + Interval. Repeat inserting till the sum in Interval time is less than End_Time.

# Component Report_Time Interval    End_Time    Start_Time  Interval_Time
#1   A       5781       700             8281        800         6581
#2   A       5781       700             8281        800         7281
#3   A       5781       700             8281        800         7981
#4   B       5781       1000            8281        298         6079        
#5   B       5781       1000            8281        298         7079
#6   B       5781       1000            8281        298         8079
#7   C       5781       1200            8281        780         6561
#8   C       5781       1200            8281        780         7761

I have been trying to achive this with if inside a for loop.. but haven't been succesfull.


Solution

  • With data.table:

    Component <- c("A", "B", "C")
    Report_Time <- c(5781, 5781, 5781)
    Interval <- c(700, 1000, 1200)
    End_Time <- c(8281, 8281, 8281)
    Start_Time <- c(800, 298, 780)
    df <- data.frame(Component, Report_Time, Interval, End_Time, Start_Time)
    
    library(data.table)
    setDT(df)
    df<-df[rep(1:.N,ceiling((End_Time-Start_Time-Report_Time)/Interval))]
    df[,Interval_Time:=ifelse(.I==1,Start_Time+Report_Time,Start_Time+cumsum(Interval)+Report_Time-Interval),by=.(Component)]
    
    df
    Component Report_Time Interval End_Time Start_Time Interval_Time
    1:         A        5781      700     8281        800          6581
    2:         A        5781      700     8281        800          7281
    3:         A        5781      700     8281        800          7981
    4:         B        5781     1000     8281        298          6079
    5:         B        5781     1000     8281        298          7079
    6:         B        5781     1000     8281        298          8079
    7:         C        5781     1200     8281        780          6561
    8:         C        5781     1200     8281        780          7761