
Write Function to Calculate Percentage and Place it in a New Column of the Data Frame


I have a data frame in which each athlete receives a performance rating of "Good", "Fair", or "Poor".

I would like to write a function that does the following:

Produces a new data frame that contains:

  • The name of the athlete
  • The percentage of times the athlete received a "Good" rating

Player <- c("Jordan", "Jordan", "Jordan", "Jordan", "Jordan", "Jordan", 
"Jordan","Jordan","Jordan", "Barkley", "Barkley", "Barkley", "Barkley", 
"Barkley", "Olajuwon", "Olajuwon", "Olajuwon", "Olajuwon", "Olajuwon", 
"Kemp", "Kemp", "Kemp", "Kemp", "Kemp", "Kemp")

Rating <- c("Good", "Fair", "Good", "Good", "Good", "Poor", "Good", "Good",  
"Good", "Fair", "Fair", "Poor", "Good", "Good", "Good", "Fair", "Good", 
"Fair", "Good", "Good", "Good", "Good", "Good", "Good", "Poor")

df <- data.frame(Player, Rating)

The output I want is:

Player    PercentGood
Jordan    77.8%
Barkley   40.0%
Olajuwon  60.0%
Kemp      83.3%

The file I receive does not include the percentage, so I want to run this each time an updated file is sent to me.

In other words: the file arrives, I apply the code, and a new data frame is produced that summarizes the percentage of times each athlete received a "Good" rating.

Thank you.


Solution

  • Here's a tidyverse solution using scales::percent to format for percentage.

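    It first creates a new variable, good, that encodes each rating as 1 if it is "Good" and 0 otherwise. The mean of that variable within each player is the proportion of "Good" ratings, which percent() then formats as a percentage.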

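    library(tidyverse)
    library(scales)
    df %>% mutate(good = ifelse(Rating == "Good", 1, 0)) %>%  # 1 if "Good", else 0
      group_by(Player = fct_inorder(Player)) %>%              # keep players in order of first appearance
      # accuracy = 0.1 asks for one decimal place; recent scales versions may otherwise round to whole percents
      summarise(PercentGood = percent(mean(good), accuracy = 0.1))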
    
    # A tibble: 4 x 2
    #  Player   PercentGood
    #  <fct>    <chr>  
    #1 Jordan   77.8%  
    #2 Barkley  40.0%  
    #3 Olajuwon 60.0%  
    #4 Kemp     83.3%
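
Since the question asks for a function that can be rerun on each new file, here is a minimal base R sketch of the same idea, with no package dependencies. The function name percent_good and its rating argument are hypothetical, chosen only for illustration, and it assumes the incoming data has columns named Player and Rating as in the question.

```r
# Hypothetical helper: percent_good() is an illustrative name, not from the original answer.
# Assumes `data` has columns Player and Rating, as in the question.
percent_good <- function(data, rating = "Good") {
  # Keep players in order of first appearance rather than alphabetical order
  players <- factor(data$Player, levels = unique(data$Player))
  # The mean of a logical vector is the proportion of TRUE values
  share <- tapply(data$Rating == rating, players, mean)
  data.frame(
    Player = names(share),
    PercentGood = sprintf("%.1f%%", 100 * as.vector(share)),  # one decimal place
    stringsAsFactors = FALSE
  )
}

# The data from the question, rebuilt compactly
Player <- rep(c("Jordan", "Barkley", "Olajuwon", "Kemp"), times = c(9, 5, 5, 6))
Rating <- c("Good", "Fair", "Good", "Good", "Good", "Poor", "Good", "Good",
            "Good", "Fair", "Fair", "Poor", "Good", "Good", "Good", "Fair", "Good",
            "Fair", "Good", "Good", "Good", "Good", "Good", "Good", "Poor")
df <- data.frame(Player, Rating)

result <- percent_good(df)
print(result)
# PercentGood comes out as 77.8%, 40.0%, 60.0%, 83.3%, matching the requested output
```

When an updated file arrives, the same call can be applied to whatever data frame is read from it, for example with read.csv() on the file's path. Note that PercentGood is a character column here, so it is suited to display rather than further arithmetic.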