I am installing a ranking system and basically I have a field called site_fees that accounts for 10% of the total for consideration. A site fee of 0 would get all 10 points. What I want to do is calculate how many points the non-zero fields would get, but I am struggling to do so.
My initial approach was to split the dataframe into 2 dataframes (dfb where site_fees are 0 and dfa where they are > 0) and calculate the average for dfa, assign the rating for dfb as 10, then union the two.
The code is as follows:
dfSitesa = dfSites[dfSites['site_fees'].notnull()]
dfSitesb = dfSites[dfSites['site_fees'].isnull()]
dfSitesa['rating'] = FeeWeight * \
dfSitesa['site_fees'].min()/dfSitesa['site_fees']
dfSitesb['rating'] = FeeWeight
dfSites = pd.concat([dfSitesa,dfSitesb])
This produces an output, however the results of dfa are not correct because the minimum of dfa is 5000 instead of 0, so the rating of a site with $5000 in fees is 10 (the maximum, not correct). What am I doing wrong?
The minimum non-zero site_fee is 5000 and the maximum is 15000. Based on this, I would expect a general ranking system like:
15000 | 0
10000 | 3.3
5000 | 6.6
0 | 10
Here is a way to do it :
dfSites = pd.DataFrame({'site_fees':[0,1,2,3,5]})
FeeWeight = 10
dfSitesa = dfSites[dfSites['site_fees'].notnull()]
dfSitesb = dfSites[dfSites['site_fees'].isnull()]
dfSitesb['rating'] = FeeWeight
factor = (dfSitesa['site_fees'].max() - dfSitesa['site_fees'].min())
dfSitesa['rating'] = FeeWeight * ( 1 - ( (dfSitesa['site_fees'] - dfSitesa['site_fees'].min()) / factor) )
dfSites = pd.concat([dfSitesa,dfSitesb])
In [1] : print(dfSites)
Out[1] :
site_fees rating
0 0 10.0
1 1 8.0
2 2 6.0
3 3 4.0
4 5 0.0