I need to rank my df by some columns. Have a look at the print below
The lines need to be ranked from 1 to 20 by the column df['pontos_na_rodada']
If we issue some ties - which will occur - they have to be resolved by the highest value in column df['saldo_gols']
. Then if the tie persist resolve it again by the column df['gols_feitos']
and lastly if we still have ties resolve it by column df['Red Cards']
and df['Yellow Cards']
where for these columns the lower value is the best.
Can someone give me a hand?
Example of the data in the image:
<bound method DataFrame.to_dict of league_season league_round fixture_id team.id resultado \
50885 2020 1.0 327986 118 3.0
46622 2020 1.0 327992 119 3.0
50863 2020 1.0 327986 120 0.0
60003 2020 1.0 327987 121 1.0
46637 2020 1.0 327991 123 3.0
46774 2020 1.0 327990 124 0.0
55991 2020 1.0 327994 126 3.0
46700 2020 1.0 327985 127 0.0
46730 2020 1.0 327988 128 1.0
46652 2020 1.0 327991 129 0.0
46758 2020 1.0 327990 130 3.0
50908 2020 1.0 327989 131 1.0
60024 2020 1.0 327987 133 1.0
46684 2020 1.0 327993 134 3.0
50931 2020 1.0 327989 144 1.0
46606 2020 1.0 327992 147 0.0
55970 2020 1.0 327994 151 0.0
46668 2020 1.0 327993 154 0.0
46743 2020 1.0 327988 794 1.0
46714 2020 1.0 327985 1062 3.0
gols_feitos saldo_gols Red Cards Yellow Cards pontos_na_rodada \
50885 2.0 1.0 0.0 3.0 3.0
46622 1.0 1.0 0.0 4.0 3.0
50863 1.0 -1.0 1.0 2.0 0.0
60003 1.0 0.0 0.0 1.0 1.0
46637 3.0 1.0 0.0 3.0 3.0
46774 0.0 -1.0 0.0 3.0 0.0
55991 3.0 3.0 0.0 NaN 3.0
46700 0.0 -1.0 0.0 3.0 0.0
46730 1.0 0.0 0.0 NaN 1.0
46652 2.0 -1.0 0.0 3.0 0.0
46758 1.0 1.0 0.0 2.0 3.0
50908 0.0 0.0 0.0 2.0 1.0
60024 1.0 0.0 0.0 1.0 1.0
46684 2.0 2.0 0.0 2.0 3.0
50931 0.0 0.0 0.0 NaN 1.0
46606 0.0 -1.0 0.0 3.0 0.0
55970 0.0 -3.0 0.0 3.0 0.0
46668 0.0 -2.0 1.0 3.0 0.0
46743 1.0 0.0 0.0 1.0 1.0
46714 1.0 1.0 0.0 2.0 3.0
rank
50885 NaN
46622 NaN
50863 NaN
60003 NaN
46637 NaN
46774 NaN
55991 NaN
46700 NaN
46730 NaN
46652 NaN
46758 NaN
50908 NaN
60024 NaN
46684 NaN
50931 NaN
46606 NaN
55970 NaN
46668 NaN
46743 NaN
46714 NaN >
I just figured out an answer to show here:
df['rank'] = np.nan
df['Red Cards'] = df['Red Cards']*-1
df['Yellow Cards'] = df['Yellow Cards']*-1
df['rank'] = df.sort_values(by = ['league_season','league_round','pontos_na_rodada',\
'saldo_gols','gols_feitos','Red Cards','Yellow Cards'])\
.groupby(['league_season','league_round']).cumcount(ascending=False)+1
df[(df['league_round']==10) & (df['league_season']==2020)].sort_values(by = 'rank')
The result:
league_season league_round fixture_id team.id resultado \
49809 2020 10.0 328084 119 0.0
50032 2020 10.0 328076 133 3.0
49919 2020 10.0 328079 1062 3.0
49671 2020 10.0 328078 126 1.0
49964 2020 10.0 328077 121 1.0
49855 2020 10.0 328083 127 0.0
49648 2020 10.0 328078 128 1.0
49694 2020 10.0 328080 130 1.0
49740 2020 10.0 328075 124 3.0
49832 2020 10.0 328083 129 3.0
49899 2020 10.0 328081 144 3.0
49717 2020 10.0 328080 154 1.0
49876 2020 10.0 328081 118 0.0
49602 2020 10.0 328082 134 3.0
49987 2020 10.0 328077 123 1.0
49763 2020 10.0 328075 131 0.0
50009 2020 10.0 328076 120 0.0
49786 2020 10.0 328084 151 3.0
49625 2020 10.0 328082 147 0.0
49942 2020 10.0 328079 794 0.0
gols_feitos saldo_gols Red Cards Yellow Cards pontos_na_rodada \
49809 0.0 -1.0 -0.0 -3.0 20.0
50032 3.0 1.0 -0.0 -2.0 18.0
49919 2.0 1.0 -0.0 -1.0 18.0
49671 2.0 0.0 -0.0 -2.0 18.0
49964 2.0 0.0 -1.0 -3.0 18.0
49855 0.0 -2.0 -0.0 NaN 17.0
49648 2.0 0.0 -0.0 -3.0 15.0
49694 1.0 0.0 -1.0 -1.0 15.0
49740 2.0 1.0 -1.0 -2.0 14.0
49832 2.0 2.0 -0.0 -1.0 13.0
49899 1.0 1.0 -0.0 -2.0 13.0
49717 1.0 0.0 -1.0 -2.0 12.0
49876 0.0 -1.0 -1.0 -2.0 12.0
49602 1.0 1.0 -0.0 -4.0 11.0
49987 2.0 0.0 -1.0 -3.0 11.0
49763 1.0 -1.0 -0.0 -4.0 10.0
50009 2.0 -1.0 -0.0 -2.0 9.0
49786 1.0 1.0 -1.0 -4.0 8.0
49625 0.0 -1.0 -1.0 -2.0 8.0
49942 1.0 -1.0 -0.0 -1.0 7.0
rank
49809 1
50032 2
49919 3
49671 4
49964 5
49855 6
49648 7
49694 8
49740 9
49832 10
49899 11
49717 12
49876 13
49602 14
49987 15
49763 16
50009 17
49786 18
49625 19
49942 20