[I am currently analyzing some price data and would like to know if it is possible to get the lowest rate per lane id and the corresponding name of the supplier who offered the cheapest rate in pandas dataframe]
data = [['Supplier 1', 'Lane 1', 'NL', 'DE', 200],
['Supplier 2', 'Lane 1', 'NL', 'DE', 150],
['Supplier 3', 'Lane 1', 'NL', 'DE', 300],
['Supplier 1', 'Lane 2', 'NL', 'DE', 200],
['Supplier 2', 'Lane 2', 'NL', 'DE', 105],
['Supplier 3', 'Lane 2', 'NL', 'DE', 100]]
columns = ['supplier_name', 'lane_id', 'origin', 'destination', 'quoted_rates']
df = pd.DataFrame(data, columns=columns)
df.sort_values(by='quoted_rates').groupby('lane_id',as_index=False).first()
Output:
lane_id supplier_name origin destination quoted_rates
0 Lane 1 Supplier 2 NL DE 150
1 Lane 2 Supplier 3 NL DE 100
In Addition:
new = df.sort_values(by='quoted_rates').groupby('lane_id',as_index=False).first()
field = 'supplier_name'
df['min_supplier'] = df.apply(lambda x:new[new.lane_id==x.lane_id][field].iloc[0],axis=1)
supplier_name lane_id origin destination quoted_rates min_supplier
0 Supplier 1 Lane 1 NL DE 200 Supplier 2
1 Supplier 2 Lane 1 NL DE 150 Supplier 2
2 Supplier 3 Lane 1 NL DE 300 Supplier 2
3 Supplier 1 Lane 2 NL DE 200 Supplier 3
4 Supplier 2 Lane 2 NL DE 105 Supplier 3
5 Supplier 3 Lane 2 NL DE 100 Supplier 3