Tags: python, pandas, datetime, strftime

Pandas to_datetime %Y%U to %Y%U%w with 53 week years


Say you have dates in the format %Y%U, such as 202101 (year 2021, week 1), and you'd like to use pd.to_datetime to convert them to a conventional date, such as 2021-01-04.

I'm trying to see if there's a better way to handle years with 53 weeks. Consider the following:

from datetime import date
import numpy as np  # needed for np.where further down
import pandas as pd

df = pd.DataFrame({'week':['202053','202101']})

Here is what I want

print(date.fromisocalendar(2020,53,1), date.fromisocalendar(2021,1,1))

2020-12-28 2021-01-04

This is what pandas does instead, which I believe is because %W counts weeks from the first Monday of the year, so week 53 of 2020 lands in January 2021:

print(pd.to_datetime(df.week.astype(str)+'1', format='%Y%W%w').tolist())

[Timestamp('2021-01-04 00:00:00'), Timestamp('2021-01-04 00:00:00')]
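A quick cross-check with the standard library may help show why both rows collapse to the same date. This is only a sketch: it uses datetime.strptime rather than pandas, and parse_week is a hypothetical helper name introduced for illustration. It assumes the same '%Y%W%w' format and the same appended '1' (Monday) as above.

```python
from datetime import datetime

def parse_week(value, fmt):
    # Append "1" so the weekday directive (%w) selects Monday.
    return datetime.strptime(value + "1", fmt).date()

# %W numbers weeks from the first Monday of the year; days before it are week 0.
w53 = parse_week("202053", "%Y%W%w")  # 53rd Monday-based week of 2020
w01 = parse_week("202101", "%Y%W%w")  # first Monday of 2021
w00 = parse_week("202100", "%Y%W%w")  # week 0 of 2021, which starts in 2020

print(w53, w01, w00)  # 2021-01-04 2021-01-04 2020-12-28
```

The first Monday of 2020 is January 6, so counting 53 Monday-based weeks from there runs past the end of the year, while week 0 of 2021 covers the days before its first Monday.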

So I can do

df['week'] = np.where(df['week'].str.endswith('53'), (df['week'].astype(int)+47).astype(str),df['week'])
print(pd.to_datetime(df.week.astype(str)+'1', format='%Y%W%w').tolist())

[Timestamp('2020-12-28 00:00:00'), Timestamp('2021-01-04 00:00:00')]

So I can add 47 to any value that ends in 53 (turning 202053 into 202100, i.e. week 0 of 2021), and I get what I'm expecting. This feels like a very roundabout way to achieve this.

Is there a more standard way to handle this?


Solution

  • Since you are using date.fromisocalendar(2020,53,1), you can parse with the ISO 8601 directives instead:

    df['week_'] = pd.to_datetime(df['week']+'1', format='%G%V%w')
    

    where

    • %G: ISO 8601 year with century representing the year that contains the greater part of the ISO week (%V).
    • %V: ISO 8601 week as a decimal number with Monday as the first day of the week. Week 01 is the week containing Jan 4.
    • %w: Weekday as a decimal number, where 0 is Sunday and 6 is Saturday.

    print(df)
    
         week      week_
    0  202053 2020-12-28
    1  202101 2021-01-04
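To sanity-check the ISO directives independently of pandas, the same idea can be sketched with the standard library and compared against date.fromisocalendar. The helper name iso_week_monday is hypothetical, and the sketch uses %u (ISO weekday, 1 = Monday through 7 = Sunday) where the answer uses %w; for Monday both take the digit 1. It assumes Python 3.8 or later, where date.fromisocalendar is available.

```python
from datetime import date, datetime

def iso_week_monday(value):
    # value is a 'YYYYWW' string holding an ISO year and ISO week;
    # the appended "1" selects Monday via the %u directive.
    return datetime.strptime(value + "1", "%G%V%u").date()

weeks = ["202053", "202101"]
parsed = [iso_week_monday(w) for w in weeks]

# Reference values built directly from the ISO calendar.
expected = [date.fromisocalendar(int(w[:4]), int(w[4:]), 1) for w in weeks]

print(parsed == expected)
print([d.isoformat() for d in parsed])
```

If the two lists agree, the '%G%V' pairing is giving the ISO week dates the question asks for, without any special handling for week 53.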