I have a dataframe with two columns, Date_of_journey and Price. The column Date_of_journey takes values between 1 and 119, but it has only 37 rows, so a lot of dates are missing.
Is there a simple way to add those dates where the price is somewhere in between the previous and next row?
Here is a plot of the data to give you an idea. I would like to add rows with Date_of_journey=4 and 5, with a Price that fits the gray curve.
You could reindex your pd.DataFrame to a full range of days using pd.RangeIndex() and then interpolate between the known values using DataFrame.interpolate(method='linear'). With more data you'll get a plot similar to yours.
import pandas as pd
import io
data = """Date_of_Journey Price
1 24089.333333
3 14873.397727
6 14035.232877
9 13178.641509
15 5785.500000"""
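# Split on any whitespace so the sample parses whether the columns are separated by tabs or spaces
df = pd.read_csv(io.StringIO(data), sep=r'\s+', index_col='Date_of_Journey')
# RangeIndex excludes `stop`, so use 120 to cover days 1 through 119
df = df.reindex(pd.RangeIndex(start=1, stop=120, step=1))
# Fill the NaN prices of the newly added days from the neighbouring known values
df.interpolate(method='linear', inplace=True)
df.plot(y='Price')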
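To check the result for the days you asked about (4 and 5), here is a minimal sketch on the same five sample rows. The shortened range of 1 to 20 and the names full and filled are only for illustration. One thing to be aware of: with the defaults, interpolate() carries the last known price forward into the days after the last observation, so I pass limit_area="inside" here to fill only the gaps between known values and leave the rest as NaN. Whether that is what you want depends on your data.

```python
import io

import pandas as pd

data = """Date_of_Journey Price
1 24089.333333
3 14873.397727
6 14035.232877
9 13178.641509
15 5785.500000"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+", index_col="Date_of_Journey")

# Reindex to every day from 1 to 20 (shortened range, chosen for illustration).
full = df.reindex(pd.RangeIndex(start=1, stop=21, name="Date_of_Journey"))

# limit_area="inside" fills only NaNs that lie between two known values,
# so the days after the last observation (16 to 20) stay NaN.
filled = full.interpolate(method="linear", limit_area="inside")

print(filled.loc[3:6])    # days 4 and 5 lie on the line between days 3 and 6
print(filled.loc[14:20])  # day 14 is filled, days 16 to 20 remain NaN
```

If you would rather have a smooth curve than straight segments, interpolate() accepts other method values as well, though several of them need SciPy installed.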