r date lubridate tsibble

tsibble yearweek function results in undesired behaviour


I am trying to convert a date column with missing values into year-week format with tsibble. Doing so results in either parsing failures or numeric output.

The final output should look like this: 2020 W1, 2020 W6.

Is there a way to solve this using the tsibble package? Strangely, the package's yearmonth function also works with NA and empty values.

library(tsibble)
library(lubridate)

x <- c("2020-01-01", "", "2020-02-06")

yearweek(x)
# Output: fails

ifelse(x == "", "", yearweek(ymd(x)))
# Output: "18260" "" "18295"

# Desired Output: "2020 W1", "", "2020 W6".

Solution

  • You can do this in base R:

    # %G is the ISO 8601 week-based year that pairs with the ISO week number %V
    format(as.Date(x), "%G W%V")
    #[1] "2020 W01" NA         "2020 W06"
    

    Passing the same dates to yearweek from tsibble, however, raises an error:

    library(tsibble)
    yearweek(as.Date(x))
    

    Error in yrs[mth_wk == "12_01"] <- yr[mth_wk == "12_01"] + 1 : NAs are not allowed in subscripted assignments

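    The yearweek function returns an object of class "yearweek" "Date". As the failed subscripted assignment in the error shows, it cannot handle NA values here, and an empty string ("") is not a valid date either, so neither can be carried through the conversion. A workaround is to convert only the non-missing dates and store the result as character instead.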

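    x1 <- as.Date(x)                      # the empty string becomes NA
    inds <- !is.na(x1)                    # positions that hold a valid date
    x2 <- character(length = length(x1))  # result vector, pre-filled with ""
    x2[inds] <- as.character(yearweek(x1[inds]))  # convert only the valid dates
    x2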
    #[1] "2020 W01" ""         "2020 W06"
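
If the exact format from the question is needed ("2020 W1", with no leading zero, and "" for missing entries), the same idea can be sketched in base R alone. This is only a sketch under a few assumptions: format_year_week is a hypothetical helper name and not part of tsibble, the input is assumed to be a character vector in "%Y-%m-%d" form, and the result is a plain character vector rather than a yearweek object. It pairs the ISO week number %V with the ISO week-based year %G, so dates around New Year get the year that matches their week. Both are output-only format codes whose support may vary on older R builds.

```r
# Hypothetical helper (not a tsibble function): format dates as "YYYY W<week>"
# and map missing or empty inputs to "".
format_year_week <- function(x) {
  # With an explicit format, "" and NA both parse to NA instead of failing
  d <- as.Date(x, format = "%Y-%m-%d")
  out <- character(length(d))                   # pre-filled with ""
  ok <- !is.na(d)                               # positions that hold a valid date
  iso_year <- format(d[ok], "%G")               # ISO 8601 week-based year
  iso_week <- as.integer(format(d[ok], "%V"))   # as.integer drops the leading zero
  out[ok] <- paste0(iso_year, " W", iso_week)
  out
}

x <- c("2020-01-01", "", "2020-02-06")
format_year_week(x)
#[1] "2020 W1" ""        "2020 W6"
```

Because the result is character, it is suitable for display or export. For anything that relies on tsibble's yearweek class, the subsetting workaround above is the closer fit.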