Player ranking dependent on number of played games

I have a df of players and their penalty points. It looks something like this:

# playerID  penalty 
#   A         0
#   B         2
#   A         4
#   C         2
#   A         0
#   B         0
#   B         2

I want to get a "player rank" which depends on the "penalty" and number of played games (A three games, C one game). If there is a player X with average penalty = 3 over 10 games and player Y with average penalty = 3 over 4 games then the "player rank" of the player X should be lower then the "player rank" of the player Y.

My idea was to divide the average penalty of player i by sqrt(n(i)), where n(i) is number of games played by player i. Second idea was to divide average penalty by log(n(i)+1).

Do you think its the right approach? Was somebody dealing with a similar problem?

Thanks for your responses.

Solution

I think you want to transfer this post to Cross Validated. Anyway, this is the solution to your problem via data.table.

library(data.table)
df = data.frame( playerID = c( "A", "B", "A", "C", "A", "B", "B"),  
                 penalty = c( 0, 2, 4, 2, 0, 0, 2 ) )

setDT(df)

df[ , games := .N, by = playerID ]
df[ , mean_penalty := mean( penalty ), by = playerID ]
df[ , `:=` ( rank1 = mean_penalty / sqrt(games),
             rank2 = mean_penalty /  log(games+1) ) ]
final = unique(df, by = "playerID" )

> final
   playerID penalty games mean_penalty     rank1     rank2
1:        A       0     3     1.333333 0.7698004 0.9617967
2:        B       2     3     1.333333 0.7698004 0.9617967
3:        C       2     1     2.000000 2.0000000 2.8853901