r dplyr time-series lubridate zoo

Loop function on timeseries works on small df, but not in large df - Error: C stack usage...too close to the limit


I have a dataframe with dates/times (time series), site (grouping variable) and value. I have identified the start times of different 'surges', defined as increases in value of >=2 within 15 mins. For each surge start, I am trying to find the date/time at which the value falls back down to (or below) the value at the start of the surge (i.e., the end of the surge).

I can achieve this by using a recursive function ('find.next.smaller' from this question - In a dataframe, find the index of the next smaller value for each element of a column). This works perfectly on a smaller dataframe, but not on a large one. I get the error message "Error: C stack usage 15925584 is too close to the limit". Having seen other similar questions (e.g., Error: C stack usage is too close to the limit), I do not think it's a problem of an infinite recursive function, but a memory issue. However, I do not know how to use shell (or PowerShell) to fix this. Is there any other way, either by adjusting my memory settings or by adapting the function below?

Some example code:

###df formatting    
library(dplyr)
df <- data.frame("Date_time" =seq(from=as.POSIXct("2022-01-01 00:00") , by= 15*60, to=as.POSIXct("2022-01-01 07:00")), 
             "Site" = rep(c("Site A", "Site B"), each = 29),
             "Value" = c(10,10.1,10.2,10.3,12.5,14.8,12.4,11.3,10.3,10.1,10.2,10.5,10.4,10.3,14.7,10.1,
                         16.7,16.3,16.4,14.2,10.2,10.1,10.3,10.2,11.7,13.2,13.2,11.1,11.4,
                         rep(10.3,times=29)))
df <- df %>% group_by(Site) %>% mutate(Lead_Value = lead(Value))
df$Surge_start <- NA
df[which(df$Lead_Value - df$Value >=2),"Surge_start"] <- 
 paste("Surge",seq(1,length(which(df$Lead_Value - df$Value >=2)),1),sep="")

###Applying the 'find.next.smaller' function

find.next.smaller <- function(ini = 1, vec) {
  if (length(vec) == 1) NA
  else c(ini + min(which(vec[1] >= vec[-1])),
         find.next.smaller(ini + 1, vec[-1]))
}
# The recursive function will go element by element through the vector and find out
# the index of the next smaller value.
df$Date_time <- as.character(df$Date_time)
Output <- df %>% group_by(Site) %>% mutate(Surge_end = ifelse(grepl("Surge",Surge_start),Date_time[find.next.smaller(1, Value)],NA))
###This works fine

df2 <- do.call("rbind", replicate(1000, df, simplify = FALSE))
Output2 <- df2 %>% group_by(Site) %>% mutate(Surge_end = ifelse(grepl("Surge",Surge_start),Date_time[find.next.smaller(1, Value)],NA))
####This does not work

Solution

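  • I suggest you don't need recursion. find.next.smaller calls itself once per element, so the call depth grows with the length of each group (29,000 rows per site in df2), and that is what exhausts the C stack. For each surge start you can instead look up the first later time at which the value is at or below the starting value: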

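    # For one row: if it starts a surge, return the first later time whose value
    # is <= the starting value. Date_time is character here, so `>` compares
    # strings, which works for "YYYY-MM-DD HH:MM:SS" timestamps.
    find_nearest_value <- function(surge, time1, val1, times, vals) {
      if (!grepl("Surge", surge)) NA else times[times > time1 & vals <= val1][1]
    }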
    
    Output %>%
      group_by(Site) %>%
      mutate(end2 = if_else(grepl("Surge", Surge_start), mapply(find_nearest_value, Surge_start, Date_time, Value, list(Date_time), list(Value)), NA)) %>%
      print(n=99)
    # # A tibble: 58 × 7
    # # Groups:   Site [2]
    #    Date_time           Site   Value Lead_Value Surge_start Surge_end           end2               
    #    <chr>               <chr>  <dbl>      <dbl> <chr>       <chr>               <chr>              
    #  1 2022-01-01 00:00:00 Site A  10         10.1 NA          NA                  NA                 
    #  2 2022-01-01 00:15:00 Site A  10.1       10.2 NA          NA                  NA                 
    #  3 2022-01-01 00:30:00 Site A  10.2       10.3 NA          NA                  NA                 
    #  4 2022-01-01 00:45:00 Site A  10.3       12.5 Surge1      2022-01-01 02:00:00 2022-01-01 02:00:00
    #  5 2022-01-01 01:00:00 Site A  12.5       14.8 Surge2      2022-01-01 01:30:00 2022-01-01 01:30:00
    #  6 2022-01-01 01:15:00 Site A  14.8       12.4 NA          NA                  NA                 
    #  7 2022-01-01 01:30:00 Site A  12.4       11.3 NA          NA                  NA                 
    #  8 2022-01-01 01:45:00 Site A  11.3       10.3 NA          NA                  NA                 
    #  9 2022-01-01 02:00:00 Site A  10.3       10.1 NA          NA                  NA                 
    # 10 2022-01-01 02:15:00 Site A  10.1       10.2 NA          NA                  NA                 
    # 11 2022-01-01 02:30:00 Site A  10.2       10.5 NA          NA                  NA                 
    # 12 2022-01-01 02:45:00 Site A  10.5       10.4 NA          NA                  NA                 
    # 13 2022-01-01 03:00:00 Site A  10.4       10.3 NA          NA                  NA                 
    # 14 2022-01-01 03:15:00 Site A  10.3       14.7 Surge3      2022-01-01 03:45:00 2022-01-01 03:45:00
    # 15 2022-01-01 03:30:00 Site A  14.7       10.1 NA          NA                  NA                 
    # 16 2022-01-01 03:45:00 Site A  10.1       16.7 Surge4      2022-01-01 05:15:00 2022-01-01 05:15:00
    # 17 2022-01-01 04:00:00 Site A  16.7       16.3 NA          NA                  NA                 
    # 18 2022-01-01 04:15:00 Site A  16.3       16.4 NA          NA                  NA                 
    # 19 2022-01-01 04:30:00 Site A  16.4       14.2 NA          NA                  NA                 
    # 20 2022-01-01 04:45:00 Site A  14.2       10.2 NA          NA                  NA                 
    # 21 2022-01-01 05:00:00 Site A  10.2       10.1 NA          NA                  NA                 
    # 22 2022-01-01 05:15:00 Site A  10.1       10.3 NA          NA                  NA                 
    # 23 2022-01-01 05:30:00 Site A  10.3       10.2 NA          NA                  NA                 
    # 24 2022-01-01 05:45:00 Site A  10.2       11.7 NA          NA                  NA                 
    # 25 2022-01-01 06:00:00 Site A  11.7       13.2 NA          NA                  NA                 
    # 26 2022-01-01 06:15:00 Site A  13.2       13.2 NA          NA                  NA                 
    # 27 2022-01-01 06:30:00 Site A  13.2       11.1 NA          NA                  NA                 
    # 28 2022-01-01 06:45:00 Site A  11.1       11.4 NA          NA                  NA                 
    # 29 2022-01-01 07:00:00 Site A  11.4       NA   NA          NA                  NA                 
    # 30 2022-01-01 00:00:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 31 2022-01-01 00:15:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 32 2022-01-01 00:30:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 33 2022-01-01 00:45:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 34 2022-01-01 01:00:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 35 2022-01-01 01:15:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 36 2022-01-01 01:30:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 37 2022-01-01 01:45:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 38 2022-01-01 02:00:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 39 2022-01-01 02:15:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 40 2022-01-01 02:30:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 41 2022-01-01 02:45:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 42 2022-01-01 03:00:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 43 2022-01-01 03:15:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 44 2022-01-01 03:30:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 45 2022-01-01 03:45:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 46 2022-01-01 04:00:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 47 2022-01-01 04:15:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 48 2022-01-01 04:30:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 49 2022-01-01 04:45:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 50 2022-01-01 05:00:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 51 2022-01-01 05:15:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 52 2022-01-01 05:30:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 53 2022-01-01 05:45:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 54 2022-01-01 06:00:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 55 2022-01-01 06:15:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 56 2022-01-01 06:30:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 57 2022-01-01 06:45:00 Site B  10.3       10.3 NA          NA                  NA                 
    # 58 2022-01-01 07:00:00 Site B  10.3       NA   NA          NA                  NA
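
If the data get much larger, the lookup above still scans the rest of the group once for every surge. Below is a minimal sketch of a single-pass alternative, assuming Value has no missing values. The helper name next_le_index() is hypothetical (it is not from the answer above or from any package). It keeps an explicit stack of indices instead of recursing, so nothing grows with the number of rows except two integer vectors.

```r
# Hypothetical helper: for each element of `vals`, return the index of the
# first later element that is <= it, or NA if there is none.
# Assumes `vals` contains no NA values.
next_le_index <- function(vals) {
  n <- length(vals)
  out <- rep(NA_integer_, n)
  stack <- integer(n)   # indices still waiting for a value that is <= theirs
  top <- 0L             # number of indices currently on the stack
  for (i in seq_len(n)) {
    # Every waiting index whose value is >= the current value ends here.
    while (top > 0L && vals[stack[top]] >= vals[i]) {
      out[stack[top]] <- i
      top <- top - 1L
    }
    top <- top + 1L
    stack[top] <- i
  }
  out
}

# Small example taken from the Site A values (00:45 to 02:15)
vals <- c(10.3, 12.5, 14.8, 12.4, 11.3, 10.3, 10.1)
idx <- next_le_index(vals)
print(idx)
# 6 4 4 5 6 7 NA
```

Inside the grouped mutate() this could stand in for the recursive call, for example Date_time[next_le_index(Value)] in place of Date_time[find.next.smaller(1, Value)], still masked by the Surge_start check. Since it uses the same "first later value at or below the start" rule, it should give the same surge ends as the output above.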