Conditional percentages calculations - R


I have this df:

> day <- c(1,1,1,1,2,2,2,2,3,3,3,3)
> Salesperson<- c("Chris", "Phil", "Joy", "Jess", "Chris", "Phil", "Joy", "Jess", "Chris", "Phil", "Joy", "Jess")
> Sales<-c(32,54,65,43,87,54,21,65,75,75,47,56)
> df <- cbind(day, Salesperson,Sales)
> df
      day Salesperson Sales
 [1,] "1" "Chris"     "32" 
 [2,] "1" "Phil"      "54" 
 [3,] "1" "Joy"       "65" 
 [4,] "1" "Jess"      "43" 
 [5,] "2" "Chris"     "87" 
 [6,] "2" "Phil"      "54" 
 [7,] "2" "Joy"       "21" 
 [8,] "2" "Jess"      "65" 
 [9,] "3" "Chris"     "75" 
[10,] "3" "Phil"      "75" 
[11,] "3" "Joy"       "47" 
[12,] "3" "Jess"      "56" 

I need to compute each salesperson's percentage of the total sales for each day. I have tried ddply and adding a new column with those values, but neither works. An example:

For day 1, Chris, and sales = 32, the percentage should be 32/(32+54+65+43) = 32/194 ≈ 0.165


Solution

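  • There is no need to call group_by() and ungroup() multiple times. Because cbind() produces a character matrix, first convert it to a data frame and make Sales numeric. After that, a single group_by(day) followed by mutate() is enough, using dplyr (with magrittr for the %<>% assignment pipe):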

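    library(dplyr)     # group_by(), mutate()
    library(magrittr)  # %<>% assignment pipe

    day <- c(1,1,1,1,2,2,2,2,3,3,3,3)
    Salesperson <- c("Chris", "Phil", "Joy", "Jess", "Chris", "Phil", "Joy", "Jess", "Chris", "Phil", "Joy", "Jess")
    Sales <- c(32,54,65,43,87,54,21,65,75,75,47,56)
    df <- cbind(day, Salesperson, Sales)  # character matrix

    # Convert the matrix to a data frame; Sales must be numeric for the division
    df <- as.data.frame(df)
    df$Sales <- as.numeric(as.character(df$Sales))

    # Divide each sale by the total for its day and assign the result back to df
    df %<>%
      group_by(day) %>%
      mutate(perc = Sales / sum(Sales))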
    
    > df
    A tibble: 12 x 4
    Groups:   day [3]
       day   Salesperson Sales   perc
       <fct> <fct>       <dbl>  <dbl>
     1 1     Chris         32. 0.165 
     2 1     Phil          54. 0.278 
     3 1     Joy           65. 0.335 
     4 1     Jess          43. 0.222 
     5 2     Chris         87. 0.383 
     6 2     Phil          54. 0.238 
     7 2     Joy           21. 0.0925
     8 2     Jess          65. 0.286 
     9 3     Chris         75. 0.296 
    10 3     Phil          75. 0.296 
    11 3     Joy           47. 0.186 
    12 3     Jess          56. 0.221 
    

    PROOF:

    df$perc[1]  == 32/(32+54+65+43)
    

    gives

     [1] TRUE
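
If you would rather not load any packages, the same per-day share can be sketched in base R with ave(). This is only an illustration, not part of the original answer: the object names sales, day_total and share_by_day are made up here, and the data is built with data.frame() instead of cbind() so that Sales stays numeric from the start.

```r
# Base R sketch: each salesperson's share of their day's total sales.
# 'sales', 'day_total' and 'share_by_day' are hypothetical names for this example.
sales <- data.frame(
  day         = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  Salesperson = rep(c("Chris", "Phil", "Joy", "Jess"), times = 3),
  Sales       = c(32, 54, 65, 43, 87, 54, 21, 65, 75, 75, 47, 56)
)

# ave() returns, for every row, the sum of Sales within that row's day
day_total <- ave(sales$Sales, sales$day, FUN = sum)
sales$perc <- sales$Sales / day_total

# Sanity check: the shares within each day should add up to 1
share_by_day <- tapply(sales$perc, sales$day, sum)
print(share_by_day)
```

Comparing with all.equal(sales$perc[1], 32/(32+54+65+43)) is a slightly safer check than ==, since it tolerates floating-point rounding.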