Search code examples
rfiltertimesum

How to calculate the total minutes when values were greater than 5?


I have some time series data where values are given every 15 minutes. For this dataset we can assume that the values do not change within that 15 minute period; therefore if the value was 1 at 8:15am, and 2 at 8:30am, then we assume that from 8:15-8:30 the value was 1, and from 8:30-8:45 the value was 2.

Now I want to calculate the monthly total of minutes when values were greater than 5.

My dataframe looks something like this (Except with 15+ columns of values).

# Create a, b, c, d variables
a <- c(
  "06-25-2021 08:00:00 AM",
  "06-25-2021 08:15:00 AM",
  "06-25-2021 08:30:00 AM",
  "06-25-2021 08:45:00 AM",
  "07-25-2021 08:45:00 AM",
  "07-25-2021 09:00:00 AM",
  "08-25-2021 08:45:00 AM",
  "08-25-2021 09:00:00 AM",
  "09-25-2021 09:15:00 AM",
  "09-25-2021 09:30:00 AM"
)
b = c(4, 5, 8, NA, 4, 5, NA, 7, 7, 6)
c = c(6, 10, 8, NA, 8, 5, NA, 8, 7, 2)
d = c(1, 3 ,NA, 6, 4, 8, 2, 4, NA, 10)

df =
  tibble(a, b, c, d) 

df$a = as.POSIXlt(df$a, format = "%m-%d-%Y%H:%M:%S", tz = 'EST')

but I want it to look like this

Name = c("b", "c", "d")
June = c(15, 45, 15 )
Jul = c(NA, 15, 15)
Aug = c(15, 15, NA)
Sept = c(45, 30, 15)

df_2 = tibble (Name,June,Jul,Aug,Sept)

I'm not sure how to sum and filter for this when it's a time series. Anyone have any suggestions?


Solution

  • Based on the description - perhaps we replace the values in columns 'b' to 'd' that are less than or equal to 5 will NA, reshape to 'long' format with pivot_longer, get the month from the 'a' column and reshape back to 'wide' with pivot_wider

    library(dplyr)
    library(tidyr)
    df %>% 
       mutate(across(2:4, ~ replace(., . <=5, NA))) %>% 
       pivot_longer(cols = b:d, names_to = 'Name', values_drop_na = TRUE) %>%
       mutate(a = format(a, '%b')) %>%
       pivot_wider(names_from = a, values_from = value, 
        values_fill = 0, values_fn = list(value = function(x) length(x) * 15))