Summarize(mean) across records keeping the variables that don't change


I have a dataframe that contains records from various devices that measure parameters like temperature and humidity. I'm trying to group the records into 10-minute intervals. Example data:

id    datetime               hum     temp    room   
<chr> <S3: POSIXct>          <dbl>   <dbl>   <chr>  
AA    2021-11-26 18:49:34    31      24      living room
AA    2021-11-26 18:54:34    29      26      living room
BB    2021-11-26 18:49:34    31      24      bathroom
BB    2021-11-26 18:54:34    33      23      bathroom

My code is:

test %>% 
    group_by(id, datetime = cut(datetime, "10 min")) %>%
    summarise(across(hum:temp, ~ mean(.x)))

How can I keep the room variable (and others that aren't in this example too) while summarising the other variables?

Wanted result:

id    datetime               hum     temp    room   
<chr> <S3: POSIXct>          <dbl>   <dbl>   <chr>  
AA    2021-11-26 18:49:00    30      25      living room
BB    2021-11-26 18:49:00    32      23.5    bathroom

My only idea is to remove the other variables beforehand and then join them back afterwards, but I thought there might be an easier way.


Solution

  • If I understand correctly, you can just add room (and any other columns that stay constant within a group) to the group_by() call:

    df %>% 
      mutate(datetime = as.POSIXct(datetime)) %>% # only needed if datetime is not already POSIXct
      group_by(id, datetime = cut(datetime, "10 min"), room) %>%
      summarise(across(hum:temp, ~ mean(.x)), .groups = "keep")
    
      id    datetime            room          hum  temp
      <chr> <fct>               <chr>       <dbl> <dbl>
    1 AA    2021-11-26 18:49:00 living room    30  25  
    2 BB    2021-11-26 18:49:00 bathroom       32  23.5
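
If there are many constant columns and listing them all in group_by() gets tedious, a possible alternative is to average the numeric columns and take the first value of every character column inside summarise(). This is only a sketch, not part of the answer above: it assumes the non-numeric columns really are constant within each id and 10-minute bin, and the "UTC" time zone is an assumption because the question does not state one. It also converts datetime back to POSIXct, since cut() returns a factor (the `<fct>` in the output above).

```r
library(dplyr)

# Sample data copied from the question; tz = "UTC" is an assumption.
df <- data.frame(
  id = c("AA", "AA", "BB", "BB"),
  datetime = as.POSIXct(
    c("2021-11-26 18:49:34", "2021-11-26 18:54:34",
      "2021-11-26 18:49:34", "2021-11-26 18:54:34"),
    tz = "UTC"
  ),
  hum = c(31, 29, 31, 33),
  temp = c(24, 26, 24, 23),
  room = c("living room", "living room", "bathroom", "bathroom"),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(id, datetime = cut(datetime, "10 min")) %>%
  summarise(
    # average every numeric measurement in the bin
    across(where(is.numeric), mean),
    # keep the first value of each character column (assumed constant per group)
    across(where(is.character), first),
    .groups = "drop"
  ) %>%
  # cut() produced a factor; turn the bin labels back into POSIXct
  mutate(datetime = as.POSIXct(as.character(datetime), tz = "UTC"))

print(result)
```

Unlike the group_by() approach, this would silently hide a column that does change within a group, so it is worth checking that assumption on the real data first.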