Search code examples
rcsvimportread.csv

R: Time values in some columns are truncated in their seconds part when using read_csv


I have a csv file with several time stamps. When I tried to import the file into RStudio, part of column is imported as strings like "2022/12/1 11:07" although the original data has section of seconds, i.e. "2022/12/1 11:07**:00**"

Additionally, other data in the same dataset are imported without this truncation.

This inconsistency makes my task a little complicated because I'm going to use lubridate::ymd_hms function for this data. I've also tried to import data with read.csv, which didn't make any difference.

Does anyone know how to avoid this phenomenon?

Thanks.

Here is dput of original csv (from clipboard). This is from the column which was imported including seconds.

structure(list(V1 = c("2022/9/8", "2022/9/8", "2022/9/8"), V2 = c("12:57", 
"13:00", "13:30")), class = "data.frame", row. Names = c(NA, -3L
))

And data below are from the column which drops seconds when imported by read_csv.

structure(list(V1 = c("2022/9/8", "2022/9/8", "2022/9/8"), V2 = c("12:57", 
"12:57", "12:57"), V3 = c("2022/9/9", "2022/9/9", "2022/9/9"), 
    V4 = c("10:35", "10:35", "10:35")), class = "data.frame", row.names = c(NA, 
-3L))

(It seems to have been separated cells at day and hour because of the space) These two kinds of data show different formats after importing to RStudio. Although it seems as if without seconds, I can see it has information of seconds (All are :00) in the formula bar of Excel.


Solution

  • If your data as loaded has hours and minutes, you can use lubridate::ymd_hm to make a datetime column (which will have seconds -- zero -- in each case).

    df1$timestamp1 = lubridate::ymd_hm(paste(df1$V1, df1$V2))
    
    
    df1
    #        V1    V2       V3    V4          timestamp1
    #1 2022/9/8 12:57 2022/9/9 10:35 2022-09-08 12:57:00
    #2 2022/9/8 12:57 2022/9/9 10:35 2022-09-08 12:57:00
    #3 2022/9/8 12:57 2022/9/9 10:35 2022-09-08 12:57:00