
Find a percentage based on multiple columns of criteria in R


I have multiple columns, and I would like to find the percentage that one column's value represents among the rows where the other columns are the same. For example:

ST  cd  variable
1   1   23432
1   1   2345
1   2   908890
1   2   350435
1   2   2343432
2   1   9999
2   1   23432 

So what I'd like to do is:

If ST and cd are the same, express each row's variable as a percentage of the total of variable over all rows with that same ST and cd. In the end it would look like:

ST  cd  variable  percentage
1   1   23432     90.90%
1   1   2345      9.10%
1   2   908890    25.23%
1   2   350435    9.73%
1   2   2343432   65.05%
2   1   9999      29.91%
2   1   23432     70.09%

How can I do this in R?

Thanks for all the help.


Solution

  • You can create your own proportion-formatting function:

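    # Turn a numeric vector into percentage strings of its own total.
    # `digits` is applied to the proportion, so digits = 4 gives 2 decimals in percent.
    prop_format <- function(x, digits = 4) {
      x <- round(x / sum(x), digits) * 100
      paste0(x, "%")
    }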
    

    Then apply it within each ST/cd group using ave (dt is the data frame from the question):

    ave(dt$variable,list(dt$ST,dt$cd),FUN=prop_format)
    
    [1] "90.9%"  "9.1%"   "25.23%" "9.73%"  "65.05%" "29.91%" "70.09%"
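
For a version that can be pasted and run as-is, here is a minimal sketch that rebuilds the question's data and stores the result as a new column. The column name `share` and the use of `sprintf` for formatting are my own choices for illustration, not part of the answer above; `sprintf` is swapped in for `paste0` only so that trailing zeros are kept (for example "90.90%" rather than "90.9%").

```r
# Rebuild the example data from the question (the name `dt` follows the answer).
dt <- data.frame(
  ST       = c(1, 1, 1, 1, 1, 2, 2),
  cd       = c(1, 1, 2, 2, 2, 1, 1),
  variable = c(23432, 2345, 908890, 350435, 2343432, 9999, 23432)
)

# Numeric share of each row within its (ST, cd) group, in percent.
# `share` is a hypothetical helper column, kept numeric for later calculations.
dt$share <- ave(dt$variable, dt$ST, dt$cd,
                FUN = function(x) 100 * x / sum(x))

# Formatted label with two decimals; sprintf keeps trailing zeros.
dt$percentage <- sprintf("%.2f%%", dt$share)

print(dt)
```

Keeping the numeric `share` column separate from the formatted `percentage` column means the values can still be summed or sorted later, while the character column is only for display.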