I have a DataFrame where I want to populate a column that currently has a date only in the first row; all other rows in the column are blank (None).
Using the value in row 0 of the df['date'] column, I have written a loop that fills each subsequent row with a date 7 days earlier than the date in the previous row.
import pandas as pd
from datetime import timedelta

first_date = pd.to_datetime(df['date'].loc[0])
date_list = [first_date]
count = 0
while count < (len(df) - 1):
    count += 1
    print(count)
    subtract_days = count * 7
    latest_date = first_date - timedelta(days=subtract_days)
    print(latest_date)
    date_list.append(latest_date)
print(date_list)
df['date_1'] = date_list
However, this requires creating a new column df['date_1'], deleting the existing df['date'], and renaming the new one.
What would be a more efficient way of doing the same?
You can try using pd.date_range:
# set your date column as the index
df.set_index('date', inplace=True)
# generate dates stepping back 7 days at a time, one per row of the DataFrame
df.index = pd.date_range(start=df.index[0], freq='-7D', periods=df.shape[0])
This can also be done without setting the column as the index:
df['date'] = pd.date_range(start=df['date'].iloc[0], freq='-7D', periods=df.shape[0])
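To see the approach end to end, here is a minimal sketch on a small made-up DataFrame. The sample values and the extra "value" column are assumptions for illustration only; the only thing taken from the question is that the 'date' column has a date in its first row and None everywhere else.

```python
import pandas as pd

# Hypothetical sample data: only the first row of 'date' is populated.
df = pd.DataFrame({
    "date": ["2021-03-29", None, None, None],
    "value": [10, 20, 30, 40],
})

# Parse the first entry, then generate one date per row,
# each 7 days earlier than the one before it.
first_date = pd.to_datetime(df["date"].iloc[0])
df["date"] = pd.date_range(start=first_date, freq="-7D", periods=len(df))

print(df)
```

Because the generated dates are assigned straight back to df['date'], no temporary column, delete, or rename is needed, and the column ends up with a datetime dtype rather than a mix of strings and None.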