I want to calculate the maximum difference of product_mrp
according to the dates.
For that I was trying to group by date but not able to get after that.
INPUT:
+-------------+--------------------+
| product_mrp | order_date |
+-------------+--------------------+
| 142 | 01-12-2019 |
| 20 | 01-12-2019 |
| 20 | 01-12-2019 |
| 120 | 01-12-2019 |
| 30 | 03-12-2019 |
| 20 | 03-12-2019 |
| 45 | 03-12-2019 |
| 215 | 03-12-2019 |
| 15 | 03-12-2019 |
| 25 | 07-12-2019 |
| 5 | 07-12-2019 |
+-------------+--------------------+
EXPECTED OUTPUT:
+-------------+--------------------+
| product_mrp | order_date |
+-------------+--------------------+
| 122 | 01-12-2019 |
| 200 | 03-12-2019 |
| 20 | 07-12-2019 |
+-------------+--------------------+
Use pandas
to load the data, then use groupby
to group by the shared index:
import pandas as pd
dates = ['01-12-2019']*4 + ['03-12-2019']*5 + ['07-12-2019']*2
data = [142,20,20,120,30,20,45,215,15,25,5]
df = pd.DataFrame(data,)
df.index = pd.DatetimeIndex(dates)
grouped = df.groupby(df.index).apply(lambda x: x.max()-x.min())
Output:
product mrp
2019-01-12 122
2019-03-12 200
2019-07-12 20