I have dates in Excel serial-number format in a DataFrame, as shown below.
import pandas as pd
df = pd.DataFrame([44562.24909722, 44562.27255787, 44562.28564815], columns=['orderdate'])
print(df)
Out[255]:
orderdate
0 44562.249097
1 44562.272558
2 44562.285648
Then I convert these Excel dates to real dates using pandas:
df['date'] = pd.to_datetime(df.orderdate, unit='D', origin='1899-12-30')
Out[258]:
orderdate date
0 44562.249097 2022-01-01 05:58:41.999808256
1 44562.272558 2022-01-01 06:32:28.999967744
2 44562.285648 2022-01-01 06:51:20.000159744
Now I need to extract week numbers from the date column. I use:
df['weekno'] = df.date.dt.isocalendar().week
orderdate date weekno
0 44562.249097 2022-01-01 05:58:41.999808256 52
1 44562.272558 2022-01-01 06:32:28.999967744 52
2 44562.285648 2022-01-01 06:51:20.000159744 52
So, in the output, I get weekno = 52 for 1 January 2022. I understand this is the ISO calendar convention, but I want week number 1 to start on 1 January every year. How can I do that?
Try using strftime like this:
import pandas as pd
df = pd.DataFrame([44562.24909722, 44562.27255787, 44562.28564815],columns=['orderdate'])
df['date'] = pd.to_datetime(df.orderdate, unit='D', origin='1899-12-30')
# This line is new
df['week_of_year'] = df["date"].apply(lambda x: int(x.strftime("%W")))
print(df)
Output:
orderdate date week_of_year
0 44562.249097 2022-01-01 05:58:41.999808256 0
1 44562.272558 2022-01-01 06:32:28.999967744 0
2 44562.285648 2022-01-01 06:51:20.000159744 0
This gives you the week number with weeks starting on Monday. Days before the first Monday of the year fall in week 0.
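As a quick check of how %W numbers the days around New Year, here is a minimal sketch using only the standard library. The helper name `monday_week` is made up for illustration and is not part of any library.

```python
from datetime import date

def monday_week(d):
    """Return the %W week number: Monday-based, week 0 before the first Monday."""
    return int(d.strftime("%W"))

print(monday_week(date(2022, 1, 1)))  # Saturday, before the first Monday -> 0
print(monday_week(date(2022, 1, 3)))  # first Monday of 2022 -> 1
print(monday_week(date(2024, 1, 1)))  # 2024 starts on a Monday -> 1
```

So week 0 only appears in years that do not start on a Monday.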
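Now, if you absolutely need the numbering to start at 1, simply add 1 :) Note that in a year where 1 January is a Monday (e.g. 2024), %W already returns 1 for that day, so adding 1 gives 2 there.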
df['week_of_year'] = df["date"].apply(lambda x: int(x.strftime("%W"))+1)
Output:
orderdate date week_of_year
0 44562.249097 2022-01-01 05:58:41.999808256 1
1 44562.272558 2022-01-01 06:32:28.999967744 1
2 44562.285648 2022-01-01 06:51:20.000159744 1
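If week 1 must always begin on 1 January regardless of the weekday, one possible alternative (a different technique from the %W approach above) is to derive the week from the day of the year. This is only a sketch: the sample dates and the column name `week_from_jan1` are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample dates covering year starts and a year end.
df = pd.DataFrame({"date": pd.to_datetime(
    ["2022-01-01", "2022-01-07", "2022-01-08", "2024-01-01", "2022-12-31"]
)})

# Days 1-7 of the year map to week 1, days 8-14 to week 2, and so on.
df["week_from_jan1"] = (df["date"].dt.dayofyear - 1) // 7 + 1
print(df["week_from_jan1"].tolist())  # [1, 1, 2, 1, 53]
```

With this scheme weeks are fixed 7-day blocks counted from 1 January, so they do not line up with Mondays, and the last block of the year (week 53) has only one or two days.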
P.S. You can check this useful tutorial on strftime. I find it a much more flexible tool for formatting dates than the one included in pandas.