Search code examples
rdplyrcoalesce

Is there a way to combine a start and endtime column to a single datetime column within R (with grouped data)


I have a dataset that contains the following columns:

starttime, endtime and ID

Is there a way to combine the starttime and endtime to a single datetime column whilst grouping by ID?

For example, I have this:

            StartTime                Endtime                ID

          12/18/2019 5:20:23 AM 12/18/2019 5:20:24 AM       A
          12/18/2019 2:01:40 PM 12/18/2019 2:01:47 PM       A

I would like this:

          DateTimeUTC                                      ID

          12/18/2019 5:20:23 AM                             A
          12/18/2019 5:20:24 AM                             A
          12/18/2019 2:01:40 PM                             A
          12/18/2019 2:01:47 PM                             A

I have tried this:

library(dplyr)

data %>%  group_by(ID) %>%  coalesce(Starttime, Endtime)

This command yields an error. I will further research this. Any help is appreciated. Thank you.


Solution

  • We can use pivot_longer from tidyr reshape the data into 'long' format. coalesce is used when there are multiple columns with missing values (NA) and want to merge into a single column which returns the first non-NA column value in each row (if there are no non-NA, it returns NA)

     library(dplyr)
     library(tidyr)
     data %>%
         pivot_longer(cols = -ID, values_to = "DateTimeUTC") %>%
         select(DateTimeUTC, ID)
    # A tibble: 4 x 2
    #  DateTimeUTC           ID   
    #  <chr>                 <chr>
    #1 12/18/2019 5:20:23 AM A    
    #2 12/18/2019 5:20:24 AM A    
    #3 12/18/2019 2:01:40 PM A    
    #4 12/18/2019 2:01:47 PM A   
    

    data

    data <- structure(list(StartTime = c("12/18/2019 5:20:23 AM", "12/18/2019 2:01:40 PM"
    ), Endtime = c("12/18/2019 5:20:24 AM", "12/18/2019 2:01:47 PM"
    ), ID = c("A", "A")), class = "data.frame", row.names = c(NA, 
    -2L))