Tags: r, date, iso8601, date-conversion

Why does converting "year + week number + weekday number" always return January 25 for all weeks and weekdays when using %V?


I have a series of dates in the format year + week number + weekday number, following the ISO 8601 convention, that I need to convert to dates. For a reason I don't understand, the conversion returns January 25 for every input.

Example:

# first and last days of the first three weeks of 2025
# as defined in ISO 8601;
# first day of the week is Monday
dates <- as.Date(c("2024-12-30", "2025-01-05", "2025-01-06", "2025-01-12", "2025-01-13", "2025-01-19"))

# dates above formatted as
# week-based year, week of the year, weekday
# as defined in ISO 8601
weeks <- strftime(dates, "%G-%V-%u")
weeks
# [1] "2025-01-1" "2025-01-7" "2025-02-1" "2025-02-7" "2025-03-1" "2025-03-7"

# conversion to dates
# following the US convention
redates <- as.Date(weeks, "%Y-%U-%u")
redates
# [1] "2025-01-06" "2025-01-05" "2025-01-13" "2025-01-12" "2025-01-20" "2025-01-19"
# Correct calculation but wrong dates,
# because the conversion uses the
# wrong convention.

# conversion to dates
# following ISO 8601
redates <- as.Date(weeks, "%Y-%V-%u")
redates
# [1] "2025-01-25" "2025-01-25" "2025-01-25" "2025-01-25" "2025-01-25" "2025-01-25"
# Why are these all the same date?!?
# Using "%G-%V-%u" returns the same result.

How can I convert week numbers and weekday numbers into dates following ISO 8601 convention?


Note

This is an example to show what is happening. In my actual case the data I start with are week numbers and weekday numbers (the vector "weeks" in the example above). There is no back and forth conversion (I don't have the vector "dates").


Solution

  • This is explained in the docs:

    %G The week-based year (see %V) as a decimal number. (Accepted but ignored on input.)

    (Emphasis mine.)

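    The week-based year %G format is supported for output, but not for input. Switching to %Y does not help, because the same documentation marks %V as accepted but ignored on input as well. Only the year is actually parsed, and strptime() fills in an unspecified month and day from the current date. That is why every element comes back as the same date (January 25 was presumably the day the code was run).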
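
    A quick way to see this for yourself: ?strptime documents %V as accepted but ignored on input, and says an unspecified month or day defaults to the current one. The sketch below assumes that documented behaviour and uses a shortened copy of the question's vector, so the date you get depends on the day you run it.

    ```r
    # Three different ISO week/weekday combinations in the same year
    weeks <- c("2025-01-1", "2025-02-7", "2025-03-1")

    parsed <- as.Date(weeks, "%Y-%V-%u")

    # TRUE when the week and weekday fields had no effect on the result
    all_same <- length(unique(parsed)) == 1
    all_same

    # Compare the parsed month-day with today's month-day
    format(parsed, "%m-%d")
    format(Sys.Date(), "%m-%d")
    ```

    If all_same is TRUE and the month-day matches today's, the week number never took part in the conversion.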

    Using the ISOweek package

    I'd prefer not to split on hyphens and write my own logic to parse the dates, as that seems error-prone. One alternative is the ISOweek::ISOweek2date() function, which takes a character vector of year, week, and weekday in the format "%Y-W%V-%u".

    This means we have to add a "W" before the week number:

    sub("^(.{5})", "\\1W", weeks) |>
        ISOweek::ISOweek2date()
    # [1] "2024-12-30" "2025-01-05" "2025-01-06" "2025-01-12" "2025-01-13" "2025-01-19"
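
    The sub() call above assumes every element has exactly four year digits followed by a hyphen. If the real data might be messier, it may be worth checking the shape first. A minimal sketch in base R, where the pattern is my assumption about what valid input looks like (zero-padded week, weekday 1 to 7):

    ```r
    weeks <- c("2025-01-1", "2025-01-7", "2025-02-1")

    # Assumed input shape: YYYY-WW-D with a two-digit week and weekday 1-7
    ok <- grepl("^[0-9]{4}-[0-9]{2}-[1-7]$", weeks)
    stopifnot(all(ok))

    # Insert the "W" after the year and its hyphen
    iso_strings <- sub("^([0-9]{4}-)", "\\1W", weeks)
    iso_strings
    # [1] "2025-W01-1" "2025-W01-7" "2025-W02-1"
    ```

    The resulting strings are in the form that ISOweek2date() expects.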
    

    Base R approach

    Alternatively, if you want to minimise dependencies, we can write a function that takes the same basic approach as ISOweek2date(). It relies on the fact that, under ISO 8601, January 4 always falls in week 1, so the Monday of that week is the first day of the week-based year:

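    get_dates <- function(weeks) {
        # Split "YYYY-WW-D" into a 3-row integer matrix, one column per input
        parts <- strsplit(weeks, "-") |>
            sapply(\(x) setNames(as.integer(x), c("year", "week", "weekday")))
    
        # January 4 is always in ISO week 1; step back to that week's Monday
        jan4 <- as.Date(paste0(parts["year", ], "-01-04"))
        jan4_weekday <- as.integer(format(jan4, "%u"))
        first_iso_monday <- jan4 - (jan4_weekday - 1)
    
        # Add the whole weeks, then the weekday offset within the week
        first_iso_monday + (parts["week", ] - 1) * 7 + (parts["weekday", ] - 1)
    }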
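
    To make the arithmetic concrete, here is the same calculation done by hand for a single value, "2025-03-7" (the Sunday of week 3 of 2025, which is the last element of the question's example). This is only a worked illustration of the steps in the function, not a replacement for it:

    ```r
    # Worked example for the ISO string "2025-03-7"
    jan4 <- as.Date("2025-01-04")
    jan4_weekday <- as.integer(format(jan4, "%u"))    # 6, i.e. a Saturday
    first_iso_monday <- jan4 - (jan4_weekday - 1)     # "2024-12-30"

    # Two whole weeks after week 1, then six days after Monday
    result <- first_iso_monday + (3 - 1) * 7 + (7 - 1)
    result
    # [1] "2025-01-19"
    ```

    That matches the last date in the question's original vector.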
    

    Here are some awkward dates near the beginning and end of the year to test it with:

    test_dates <- as.Date(c("2019-12-30", "2020-01-04", "2020-12-31", "2021-01-01", "2021-01-04", "2021-12-31", "2022-01-03", "2022-07-15", "2023-01-02", "2023-12-31", "2024-01-01", "2024-12-30", "2025-01-04", "2025-07-15"))
    

    Let's make sure that converting them to weeks format and back works:

    weeks <- strftime(test_dates, format = "%G-%V-%u")
    identical(
        test_dates,
        get_dates(weeks)
    )
    # [1] TRUE
    

    Although this seems to work, it's fiddly, and I never really trust hand-rolled date code like this. I'd strongly recommend using an existing function such as ISOweek2date() unless adding a dependency is not an option.
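
    One gap in the base R arithmetic is that it happily accepts a week that doesn't exist, such as week 53 of a 52-week year, and silently rolls into the next year. If you do go the base R route, a round-trip check can catch that. The helper below is a hypothetical sketch (iso_to_date_checked is my own name, not from any package): it repeats the January 4 calculation, formats the result back with "%G-%V-%u", and returns NA wherever the output no longer matches the input. It assumes well-formed, zero-padded "YYYY-WW-D" strings.

    ```r
    # Hypothetical helper: January 4 arithmetic plus a round-trip check
    iso_to_date_checked <- function(weeks) {
        # One row per input: year, week, weekday as integers
        parts <- do.call(rbind, lapply(strsplit(weeks, "-"), as.integer))
        year <- parts[, 1]
        week <- parts[, 2]
        weekday <- parts[, 3]

        # Monday of ISO week 1 is the Monday on or before January 4
        jan4 <- as.Date(sprintf("%04d-01-04", year))
        first_iso_monday <- jan4 - (as.integer(format(jan4, "%u")) - 1L)

        result <- first_iso_monday + (week - 1L) * 7L + (weekday - 1L)

        # Blank out anything that does not format back to the original string
        round_trip <- format(result, "%G-%V-%u")
        result[is.na(round_trip) | round_trip != weeks] <- NA
        result
    }

    checked <- iso_to_date_checked(c("2020-53-5", "2025-01-1", "2025-53-1"))
    checked
    # [1] "2021-01-01" "2024-12-30" NA
    ```

    2020 has 53 ISO weeks, so "2020-53-5" is valid. 2025 has only 52, so "2025-53-1" comes back as NA rather than as a date in 2026-W01.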