I have a dataframe containing materials, dates of purchase and purchase prices. I want to filter my dataframe such that I only keep one row containing each material, and that row contains the material at the latest purchase date and corresponding price.
How could I achieve this? I have racked my brains trying to work out how to apply aggregation functions to this but I just can't work out how.
Do a multisort and then use drop duplicates, keeping the first occurrence.
import pandas as pd
df.sort_values(by=['materials', 'purchase_date'], ascending=[True, False], inplace=True)
df.drop_duplicates(subset=['materials'], keep='first', inplace=True)