r date tidyverse lubridate tibble

Date parsed with parse_date_time() from lubridate loses its format when merged with tibble


I'm scraping a website and trying to format certain text as a date. The scraping and formatting code works correctly, but when I try to merge the date column with the rest of the data, it loses its date format and type. I'm at a loss as to how to fix it and can't find any relevant information.

Reproducible example:

library(tidyverse)
library(lubridate)

test_dates <- c("Sunday, Nov 15, 2020", "Monday, Nov 16, 2020", "Thursday, Nov 19, 2020", "Sunday, Nov 22, 2020", "Monday, Nov 23, 2020"  )
dates <- parse_date_time(test_dates, "AbdY")
glimpse(dates)

x1 <- rnorm(5)
x2 <- rnorm(5)

mydata <- as_tibble(cbind(dates, x1, x2))
glimpse(mydata)

mydata <- as_tibble(cbind(as_date(dates), x1, x2))
glimpse(mydata)

This gives the following output:

> library(tidyverse)
> library(lubridate)
> test_dates <- c("Sunday, Nov 15, 2020", "Monday, Nov 16, 2020", "Thursday, Nov 19, 2020", "Sunday, Nov 22, 2020", "Monday, Nov 23, 2020"  )

> dates <- parse_date_time(test_dates, "AbdY")
> glimpse(dates)
 POSIXct[1:5], format: "2020-11-15" "2020-11-16" "2020-11-19" "2020-11-22" "2020-11-23"

> x1 <- rnorm(5)
> x2 <- rnorm(5)

> mydata <- as_tibble(cbind(dates, x1, x2))
> glimpse(mydata)
Rows: 5
Columns: 3
$ dates <dbl> 1605398400, 1605484800, 1605744000, 1606003200, 1606089600
$ x1    <dbl> -0.1142434, -0.1638176, -0.8392169, 1.2231866, -1.3134138
$ x2    <dbl> -0.6944343, -0.2210215, 1.0754251, -0.4685189, -0.2033346

> mydata <- as_tibble(cbind(as_date(dates), x1, x2))
> glimpse(mydata)
Rows: 5
Columns: 3
$ V1 <dbl> 18581, 18582, 18585, 18588, 18589
$ x1 <dbl> -0.1142434, -0.1638176, -0.8392169, 1.2231866, -1.3134138
$ x2 <dbl> -0.6944343, -0.2210215, 1.0754251, -0.4685189, -0.2033346

As you can see at the end, the variable reverts to a double rather than keeping its POSIXct class or any other date-related type. Any suggestions?


Solution

  • We can build the table directly with tibble or data.frame, passing the vectors as separate arguments

    tibble(dates, x1, x2)
    

    Or

    data.frame(dates, x1, x2)
    

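    instead of cbind. When all of its arguments are plain vectors, cbind builds a matrix, and a matrix can hold only a single type and keeps no per-column class, so the POSIXct values are reduced to their underlying numbers (seconds since 1970-01-01). cbind uses its data frame method only when at least one argument is a data frame. If we want to keep the cbind style, we can call that method directly with cbind.data.frame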

    cbind.data.frame(dates, x1, x2)
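
To check the explanation above, here is a minimal sketch that compares the two approaches on a shortened version of the question's data. The last step, converting the numeric column back with as_datetime(), is not part of the original answer; it is an assumption about how one might repair data that has already gone through cbind, and it assumes the values are seconds in UTC, which is what parse_date_time() returns by default.

```r
library(tibble)
library(lubridate)

# Shortened version of the question's input
test_dates <- c("Sunday, Nov 15, 2020", "Monday, Nov 16, 2020")
dates <- parse_date_time(test_dates, "AbdY")
x1 <- rnorm(2)

# cbind() on plain vectors returns a numeric matrix,
# so the POSIXct class attribute is dropped
m <- cbind(dates, x1)
is.matrix(m)        # TRUE
class(dates)        # "POSIXct" "POSIXt"

# tibble() stores each vector as its own column and keeps its class
good <- tibble(dates, x1)
class(good$dates)   # "POSIXct" "POSIXt"

# Assumed repair step (not from the original answer): convert the
# seconds-since-epoch column back to a date-time, interpreting it as UTC
fixed <- as_tibble(m)
fixed$dates <- as_datetime(fixed$dates)
class(fixed$dates)  # "POSIXct" "POSIXt"
```

In practice it is simpler to avoid the matrix step altogether, since the conversion back has to assume a time zone that the matrix no longer records.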