I'm new to Python and looking for some help building a function. I have searched other answers but haven't found exactly what I'm looking for (please redirect me if this is a duplicate).
I am working with pandas and have the DataFrame below, which contains two columns of rankings:
I have 3 copies of this dataframe all in the same format but with different values. Each contains two ranking columns (rank_ctb in col1 and rank_score in col2).
I want to build a function that takes one of these DataFrames and puts the IDs (the index column) of the 5 highest rankings in col 1 into one list, and the IDs of the 5 highest rankings in col 2 into another list.
So in this example data, where col 1 is already sorted by ranking, the lists would contain:
#5 highest rankings from RANK_CTB
List_One = [Test_Data_1, Test_Data_9, Test_Data_19, Test_Data_5, Test_Data_8]
#5 highest rankings from RANK_SCORE (this column is not sorted, and 3rd and 5th ranks aren't visible in my example data)
List_Two = [Test_Data_8, Test_Data_22, some_other_ID, Test_Data_26, some_other_ID2]
My initial thought is that I need to set up two empty lists and use a for loop, but from there I'm completely stuck.
Here is a test function that I think will do the job. Modify it as needed:
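def test(df):
    # Assumes the IDs are in a regular column named 'ID'; if they are the index,
    # move them into a column first, e.g. df = df.rename_axis('ID').reset_index()
    list_one = []
    list_two = []
    # reverse=True takes the 5 largest values; drop it if rank 1 is the best
    col1_highest = sorted(df.RANK_CTB, reverse=True)[:5]
    col2_highest = sorted(df.RANK_SCORE, reverse=True)[:5]
    for i in range(len(col1_highest)):
        # .iloc[0] pulls the single ID out of the matching row (assumes no tied ranks)
        list_one.append(df.loc[df.RANK_CTB == col1_highest[i], 'ID'].iloc[0])
        list_two.append(df.loc[df.RANK_SCORE == col2_highest[i], 'ID'].iloc[0])
    return list_one, list_two

list_one, list_two = test(name_of_df)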
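
If the IDs really are the DataFrame index, as the question describes, pandas' built-in nsmallest / nlargest can probably replace the manual sort and loop. Below is a minimal sketch under a few assumptions: the columns are named RANK_CTB and RANK_SCORE, rank 1 counts as the "highest" ranking (flip best_is_lowest if it is the other way round), and the function name top_ids, the best_is_lowest flag and the sample data are all made up for illustration.

```python
import pandas as pd


def top_ids(df, n=5, best_is_lowest=True):
    """Return two lists of index labels: the top n rows by RANK_CTB and by RANK_SCORE."""
    # Assumption: rank 1 is the best rank, so "highest ranking" means smallest number.
    pick = df.nsmallest if best_is_lowest else df.nlargest
    list_one = pick(n, "RANK_CTB").index.tolist()
    list_two = pick(n, "RANK_SCORE").index.tolist()
    return list_one, list_two


# Hypothetical sample data, with the IDs as the index
sample = pd.DataFrame(
    {"RANK_CTB": [1, 2, 3, 4, 5, 6, 7], "RANK_SCORE": [7, 5, 6, 1, 3, 2, 4]},
    index=[f"Test_Data_{i}" for i in range(1, 8)],
)

list_one, list_two = top_ids(sample)
print(list_one)
print(list_two)
```

Because the function takes the DataFrame as an argument, the same call should work on each of the three copies, e.g. top_ids(df_a), top_ids(df_b), top_ids(df_c), where those variable names are placeholders.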