python pandas dataframe datetime data-cleaning

Creating day of the week column from existing date time column in Pandas dataframe with Python


Hi StackOverflow community,

I have a data frame with a datetime column covering 2010-01-04 to 2020-01-01. Here is a small random snippet:

28  2010-01-04  0.70930     LVL
29  2010-01-04  1.58850     AUD
30  2010-01-04  26.28500    CZK
31  2010-01-04  1.49530     CAD
32  2010-01-04  11.16080    HKD
33  2010-01-05  1645.74000  KRW
34  2010-01-05  0.90045     GBP
35  2010-01-05  15.64660    EEK
36  2010-01-05  18.48330    MXN
37  2010-01-05  1.44420     USD
38  2010-01-05  10.50690    ZAR

How can I create a new "Day of the Week" column for each observation? For example,

28  2010-01-04  0.70930     LVL    Monday
29  2010-01-04  1.58850     AUD    Monday
30  2010-01-04  26.28500    CZK    Monday
31  2010-01-04  1.49530     CAD    Monday
32  2010-01-04  11.16080    HKD    Monday
33  2010-01-05  1645.74000  KRW    Tuesday
34  2010-01-05  0.90045     GBP    Tuesday
35  2010-01-05  15.64660    EEK    Tuesday
36  2010-01-05  18.48330    MXN    Tuesday
37  2010-01-05  1.44420     USD    Tuesday
38  2010-01-05  10.50690    ZAR    Tuesday

Thank you all for any guidance.

Tony

Here is the code to my project:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import requests
import scipy.stats as sp

start_date = "2010-01-01"
end_date = "2020-01-01"
symbol = "GBP"

resp = requests.get('https://api.exchangeratesapi.io/history?start_at={}&end_at={}&base=EUR'.format(start_date, end_date))

if resp.status_code != 200:
    # This means something went wrong.
    raise RuntimeError('GET /history {}'.format(resp.status_code))

data = resp.json()
# Flatten the nested JSON into one wide row with 'rates.<date>.<symbol>' columns.
df2 = pd.json_normalize(data)
df2.drop(["start_at","base","end_at"],axis=1,inplace=True)
df2.columns = [col.replace('rates.', '') for col in df2.columns]
# Reshape to long format: one row per '<date>.<symbol>' key.
df3 = df2.stack().reset_index().drop('level_0',axis=1)
df3.columns = ['Date', 'FXRateEUR']
# Split '<date>.<symbol>' into separate Date and Symbol columns.
df3[['Date', 'Symbol']] = df3['Date'].str.split('.', n=1, expand=True)
df3['Date'] = pd.to_datetime(df3['Date'])
df3 = df3.sort_values(by='Date').reset_index(drop=True)
df3.head(50)

Solution

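  • You can use the .dt.day_name() accessor, which returns the weekday name for each date. Since df3['Date'] has already been converted with pd.to_datetime in the question's code, the extra pd.to_datetime call here is harmless but optional:

    df3['Day of the Week'] = pd.to_datetime(df3['Date']).dt.day_name()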
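
For anyone who wants to try this without calling the API, here is a minimal sketch on a few rows taken from the question. The frame name df and the extra "Weekday Number" column are my own additions for illustration; the column names Date, FXRateEUR and Symbol follow the question's code.

```python
import pandas as pd

# A few sample rows copied from the question (the frame name `df` is assumed).
df = pd.DataFrame({
    "Date": ["2010-01-04", "2010-01-04", "2010-01-05", "2010-01-05"],
    "FXRateEUR": [0.70930, 1.58850, 1645.74000, 0.90045],
    "Symbol": ["LVL", "AUD", "KRW", "GBP"],
})

# Convert once so the .dt accessor is available on the column.
df["Date"] = pd.to_datetime(df["Date"])

# Full weekday name for each row, e.g. "Monday".
df["Day of the Week"] = df["Date"].dt.day_name()

# Optional illustration: integer weekday (Monday=0 ... Sunday=6),
# which can be handy for sorting or filtering by day.
df["Weekday Number"] = df["Date"].dt.dayofweek

print(df)
```

The name column is convenient for display, while the integer column sorts in calendar order, which the names alone would not.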