I have slice of my dataframe like this and I want to sort values by person and buy date, then all other columns are sorted ascending automatically.
df = pd.DataFrame({'person': [1, 1, 1, 2, 3, 3],
'brand': ['b', 'a', 'a', 'b', 'a', 'b'],
'product': ['A1', 'A1', 'A2', 'B1', 'C1', 'C2'],
'buydate': ['20220101', '20220101', '20220401', '20220601', '20220630', '20221201'],
'type': ['2', '1', '1', '1', '1', '2'],
'price': [50, 20, 200, 300, 20, 150],})
and I expected:
person | brand | product | buydate | type | price | |
---|---|---|---|---|---|---|
0 | 1 | a | A1 | 20220101 | 1 | 20 |
1 | 1 | b | A1 | 20220101 | 2 | 50 |
2 | 1 | a | A2 | 20220401 | 1 | 200 |
3 | 2 | b | B1 | 20220601 | 1 | 300 |
4 | 3 | a | C1 | 20220630 | 1 | 20 |
5 | 3 | b | C2 | 20221201 | 2 | 150 |
I have tried
df = df.sort_values(['person', 'buydate'])
but I only got:
person | brand | product | buydate | type | price | |
---|---|---|---|---|---|---|
0 | 1 | b | A1 | 20220101 | 2 | 50 |
1 | 1 | a | A1 | 20220101 | 1 | 20 |
2 | 1 | a | A2 | 20220401 | 1 | 200 |
3 | 2 | b | B1 | 20220601 | 1 | 300 |
4 | 3 | a | C1 | 20220630 | 1 | 20 |
5 | 3 | b | C2 | 20221201 | 2 | 150 |
My dataset is pretty big and contains a lot of columns, so it is highly appreciated if any solutions with short runtime. While using lambda is required too much time, any solutions using lambda is also welcomed!
Thank you so much!
You can craft the sorting list using Index.difference
:
cols = ['person', 'brand']
out = df.sort_values(by=cols+list(df.columns.difference(cols)))
If you want to use a custom order with a default of ascending for the other columns:
cols = ['person', 'brand']
out = df.sort_values(by=cols+(l:=list(df.columns.difference(cols))),
ascending=[True, True]+[True]*len(l)
)
# or for python <3.8 that doesn't support the walrus operator
# cols = ['person', 'brand']
# order = [True, True]
# default_order = True
# out = df.sort_values(by=cols+list(df.columns.difference(cols)),
# ascending=order+[default_order]*len(set(df)-set(cols))
# )
Output:
person brand product buydate type price
1 1 a A1 20220101 1 20
2 1 a A2 20220401 1 200
0 1 b A1 20220101 2 50
3 2 b B1 20220601 1 300
4 3 a C1 20220630 1 20
5 3 b C2 20221201 2 150