I have a pandas dataframe with data like:
+-----------+-----------------+---------+
| JOB-NAME | Status | SLA |
+-----------+-----------------+---------+
| job_1 | YET_TO_START | --- |
| job_3 | COMPLETED | MET |
| job_4 | RUNNING | MET |
| job_2 | YET_TO_START | LATE |
| job_6 | RUNNING | LATE |
| job_5 | FAILED | LATE |
| job_7 | YET_TO_START | --- |
| job_8 | COMPLETED | NOT_MET |
+-----------+-----------------+---------+
I need to sort this table based on the Status and SLA states, like for Status: FAILED will be top on the table, then YET_TO_START, then RUNNING, and finally COMPLETED. Similarly for SLA the order will be LATE, ---, NOT_MET, and MET. Like this:
+-----------+-----------------+---------+
| JOB-NAME | Status | SLA |
+-----------+-----------------+---------+
| job_5 | FAILED | LATE |
| job_2 | YET_TO_START | LATE |
| job_1 | YET_TO_START | --- |
| job_7 | YET_TO_START | --- |
| job_6 | RUNNING | LATE |
| job_4 | RUNNING | MET |
| job_8 | COMPLETED | NOT_MET |
| job_3 | COMPLETED | MET |
+-----------+-----------------+---------+
I am able to do this custom sorting priority-based only on single column Status, but unable to do for multiple columns.
sort_order_dict = {"FAILED":0, "YET_TO_START":1, "RUNNING":2, "COMPLETED":3}
joined_df = joined_df.sort_values(by=['status'], key=lambda x: x.map(sort_order_dict))
A solution is given here, but its for single column, not multiple column.
You can extend dictionary by values from another columns, only necessary different keys in both columns for correct working like mentioned mozway in comments:
sort_order_dict = {"FAILED":0, "YET_TO_START":1, "RUNNING":2, "COMPLETED":3,
"LATE":4, "---":5, "NOT_MET":6, "MET":7}
df = df.sort_values(by=['Status','SLA'], key=lambda x: x.map(sort_order_dict))
print (df)
JOB-NAME Status SLA
5 job_5 FAILED LATE
3 job_2 YET_TO_START LATE
0 job_1 YET_TO_START ---
6 job_7 YET_TO_START ---
4 job_6 RUNNING LATE
2 job_4 RUNNING MET
7 job_8 COMPLETED NOT_MET
1 job_3 COMPLETED MET
Or use ordered Categorical
:
df['Status'] = pd.Categorical(df['Status'], ordered=True,
categories=['FAILED', 'YET_TO_START', 'RUNNING', 'COMPLETED'])
df['SLA'] = pd.Categorical(df['SLA'], ordered=True,
categories= ['LATE', '---', 'NOT_MET', 'MET'])
df = df.sort_values(by=['Status','SLA'])
print (df)
JOB-NAME Status SLA
5 job_5 FAILED LATE
3 job_2 YET_TO_START LATE
0 job_1 YET_TO_START ---
6 job_7 YET_TO_START ---
4 job_6 RUNNING LATE
2 job_4 RUNNING MET
7 job_8 COMPLETED NOT_MET
1 job_3 COMPLETED MET