Search code examples
rdatetimestrsplit

How to split Monday, July 1, 2019 12:00:00:000 AM


I have read, studied, and tested, but I'm just not getting it. Here is my data frame:

                               MyDate TEMP1 TEMP2
Monday, July 1, 2019 12:00:00:000 AM  90.0  1586

Monday, July 1, 2019 12:01:00:000 AM  88.6  1581

Monday, July 1, 2019 12:02:00:000 AM  89.4  1591

Monday, July 1, 2019 12:03:00:000 AM  90.5  1586

I need to compare it to a second data frame:

   Date     Time A.B.Flow A.B.Batch.Volume

7/1/2019 14:47:46      1.0              2.0

7/9/2019 14:47:48      3.0              5.0

7/11/2019 14:47:52      0.0              2.0

7/17/2019 14:48:52      3.8              4.0

7/24/2019 14:49:52      0.0              3.1

I just have to combine the two data frames when the minutes dates, hours, and minutes match. The seconds do not have to match.

So far I have gleaned that I need to convert the first Column MyDate into separate Dates and Times. I've been unable to come up with a strsplit command that actually does this.

This just gives each element in quotes:

Tried, newdate <- strsplit(testdate$MyDate, "\\s+ ")[[3]]    

This is better but "2019"is gone:

Tried, newdate <- strsplit(testdate$MyDate, "2019")           

It looks like this:

[1] "Monday, July 1, " "12:00:00:000 AM" 

[[2]]
[1] "Monday, July 1, " "12:01:00:000 AM" 

[[3]]
[1] "Monday, July 1, " "12:02:00:000 AM" 

[[4]]
[1] "Monday, July 1, " "12:03:00:000 AM" 

Please tell me what I am doing wrong. I would love some input as to whether I am barking up the wrong tree.

I've tried a few other things using anytime and lubridate, but I keep coming back to this combined date and time with the day written out as my nemesis.


Solution

  • You could get rid of the day (Monday, ...) in your MyDate field by splitting on ',', removing the first element, then combining the rest and converting to POSIXCt.

    Assuming your first dataframe is called df:

    dt <- strsplit(df$MyDate, ',')
    df$MyDate2 <- sapply(dt, function(x) trimws(paste0(x[-1], collapse = ',')))
    df$MyDate2 <- as.POSIXct(df$MyDate2, format = '%b %d, %Y %H:%M:%S')
    

    And since you are not interested in the seconds part of the timestamps, you can do:

    df$MyDate2 <- format(df$MyDate2, '%Y-%m-%d %H:%M')
    

    You should similarly convert the Date/Time fields of your second dataframe df2, creating a MyDate2 field there with the seconds part removed as above.

    Now you can merge the two dataframes on the MyDate2 column.