
Loop operation on a list of data frames


I have a large data frame (> 100 k rows) made up of several sampling events at different stations and on different dates. Here is a simulated data set similar to mine; the real one has more columns, which I also need to keep. Note that my real data has multiple dates per station, which this example does not (the real data gives 500 date*station combinations).

    df <- data.frame("station" = rep(c("A", "B", "C", "D"), each = 4),
                     "date" = rep(c("2011-01-20", "2011-06-05", "2015-07-15", "2017-08-09"), each = 4),
                     "depth" = rep(c(1, 2, 3, 4), 4),
                     "temp" = runif(16))
    df

I need to calculate the temperature difference (delta) between consecutive depths, for each date and station. What I expect is the d_temp column here:

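    df_expected <- data.frame("station" = rep(c("A", "B"), each = 4),
                         "date" = rep(c("2011-01-20", "2011-06-05"), each = 4),
                         "depth" = rep(c(1, 2, 3, 4), 2),
                         # temperatures matching the differences written out in d_temp
                         "temp" = c(0.9, 0.69, 0.63, 0.94, 0.55, 0.72, 0.33, 0.81),
                         # NA (not the string "NA") so that d_temp stays numeric
                         "d_temp" = c(NA, 0.69-0.9, 0.63-0.69, 0.94-0.63, NA, 0.72-0.55, 0.33-0.72, 0.81-0.33))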
    df_expected

I tried splitting everything into a list, but I am stuck from there. I tried a for loop over the list and lapply, but I can't find the solution to something that should be simple.

Thank you for your help


Solution

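  • Here's a dplyr solution. Within each station/date group, lag(temp, order_by = depth) returns the temperature at the previous depth, so d_temp is NA at the shallowest depth of each group. The .by argument requires dplyr 1.1.0 or later: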

    set.seed(234)
    df <- data.frame("station" = rep(c("A", "B", "C", "D"), each = 4),
                     "date" = rep(c("2011-01-20", "2011-06-05", "2015-07-15", "2017-08-09"), each = 4),
                     "depth" = rep(c(1, 2, 3, 4), 4),
                     "temp" = runif(16))
    
    library(dplyr)
    
    
    df %>%
      mutate(d_temp = temp - lag(temp, order_by = depth),
             .by = c(station, date))
    #>    station       date depth        temp        d_temp
    #> 1        A 2011-01-20     1 0.745619998            NA
    #> 2        A 2011-01-20     2 0.781712425  0.0360924273
    #> 3        A 2011-01-20     3 0.020037114 -0.7616753110
    #> 4        A 2011-01-20     4 0.776085387  0.7560482735
    #> 5        B 2011-06-05     1 0.066910093            NA
    #> 6        B 2011-06-05     2 0.644795124  0.5778850310
    #> 7        B 2011-06-05     3 0.929385959  0.2845908350
    #> 8        B 2011-06-05     4 0.717642189 -0.2117437709
    #> 9        C 2015-07-15     1 0.927736510            NA
    #> 10       C 2015-07-15     2 0.284230120 -0.6435063903
    #> 11       C 2015-07-15     3 0.555724930  0.2714948107
    #> 12       C 2015-07-15     4 0.547701653 -0.0080232776
    #> 13       D 2017-08-09     1 0.582847855            NA
    #> 14       D 2017-08-09     2 0.582989913  0.0001420584
    #> 15       D 2017-08-09     3 0.001198341 -0.5817915718
    #> 16       D 2017-08-09     4 0.441117854  0.4399195127
    

    Created on 2023-08-03 with reprex v2.0.2
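
For comparison, the split-and-lapply route mentioned in the question can also be made to work in base R. The following is only a minimal sketch: the small data frame reuses the temperatures implied by the question's expected output, and the names groups, with_delta and result are made up for illustration. It assumes each station/date group should be sorted by depth before differencing.

```r
# Small deterministic example; temp values are the ones implied by the
# differences written out in the question's df_expected
df <- data.frame(
  station = rep(c("A", "B"), each = 4),
  date    = rep(c("2011-01-20", "2011-06-05"), each = 4),
  depth   = rep(c(1, 2, 3, 4), 2),
  temp    = c(0.9, 0.69, 0.63, 0.94, 0.55, 0.72, 0.33, 0.81)
)

# One data frame per station/date combination;
# drop = TRUE discards combinations that never occur in the data
groups <- split(df, list(df$station, df$date), drop = TRUE)

# Within each group: sort by depth, then difference consecutive temperatures.
# The first row of each group has no previous depth, hence the leading NA.
with_delta <- lapply(groups, function(g) {
  g <- g[order(g$depth), ]
  g$d_temp <- c(NA, diff(g$temp))
  g
})

# Stack the pieces back into a single data frame; all other columns are kept
result <- do.call(rbind, with_delta)
rownames(result) <- NULL
result
```

With > 100 k rows the dplyr version above is likely the more convenient option, but this sketch shows the step that was missing after split(): apply a function to each piece that returns the whole piece with the new column, then bind the pieces back together.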