I'm working with some time series data, and I'm trying to standardise the dates relative to a specific date, so as to better highlight the 'before and after' element.
For example, using the following randomly-generated data:
import numpy as np
import pandas as pd
from datetime import date, timedelta
import matplotlib.pyplot as plt
np.random.seed(0)
randnums = np.random.randint(1,101,31)
sdate = date(2023,3,1) # start date
edate = date(2023,4,1) # end date
datelist = list(pd.date_range(sdate,edate-timedelta(days=1),freq='D'))
df = pd.DataFrame(zip(datelist,randnums),columns=['date','data'])
plt.subplots(figsize=(12, 12))
plt.plot(df['date'], df['data'])
This generates the following plot:
I'm trying to write code that uses a specified date (say, 2023-03-15) as a reference point and defines all other dates relative to it. Using the graph above, this would mean that 2023-03-17 would appear on the x-axis as "t2" (since 17 March 2023 is two days after the reference date of 15 March 2023), 2023-03-21 would appear as "t6", 2023-03-13 would appear as "t-2", 2023-03-09 would appear as "t-6", and so forth.
I've been trying all sorts of things and looking around but cannot find a way to do this. Would you have any ideas?
Thank you very much in advance!
You can use df['date'].apply() to pass each date to a function that calculates its offset from the reference date:
SELECTED_DATE = pd.Timestamp(2023,3,15) # selected date for relative
def get_relative_day(date:"pd.Timestamp"):
    delta = date - SELECTED_DATE
    relative_day = f"t{delta.days}"
    return relative_day
df['relative_day'] = df['date'].apply(get_relative_day)
# when plotting, use the relative-day labels on the x-axis instead of the dates
plt.plot(df['relative_day'], df['data'])
output:
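As a possible alternative, the same labels can probably be built without a per-row function, by subtracting the reference date from the whole column at once. This is only a minimal sketch: REFERENCE_DATE and the offset_days column are names made up for illustration, and the date range is rebuilt here just so the snippet runs on its own.

```python
import pandas as pd

# Same date range as in the question (1 to 31 March 2023).
df = pd.DataFrame({"date": pd.date_range("2023-03-01", "2023-03-31", freq="D")})

# Hypothetical name for the reference date (not from the original answer).
REFERENCE_DATE = pd.Timestamp(2023, 3, 15)

# Subtracting a Timestamp from a datetime column gives a timedelta column;
# .dt.days extracts the whole-day offset as integers.
df["offset_days"] = (df["date"] - REFERENCE_DATE).dt.days

# Build the "t<offset>" labels by string concatenation on the whole column.
df["relative_day"] = "t" + df["offset_days"].astype(str)
```

Keeping the integer offset_days column around may also be useful for plotting: string labels are treated as categories by matplotlib, whereas plotting against the integers keeps the x-axis numeric, and the tick labels could then be formatted as "t-2", "t2", etc. with something like matplotlib.ticker.FuncFormatter.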