Using no third-party packages, is there a way to calculate within-group percentages for counts of categorical data?


I'm in an unusual situation where I currently can't install third-party packages in my R setup. Given that constraint, is there a way to summarize the following data on restaurant locations and their open/closed status?

A count(business,vars=c("city","open")) on my data gives me something like this:

"City"       "Open"   "Frequency"
Wickenburg   False    2
Wickenburg   True     26
Wittmann     True     2
Wittmann     False    2
Youngtown    True     7
Yuma         True     1

This is a frequency table of how many restaurants are open and how many are closed in each city.

I want to find each row's percentage of its city's total. Example output would look like this:

"City"       "Open"   "Frequency"    "Pct of City"
Wickenburg   False    2               7.1
Wickenburg   True     26              92.9
Wittmann     True     2               50.0
Wittmann     False    2               50.0
Youngtown    True     7               100.0
Yuma         True     1               100.0

What's the easiest way to do that in vanilla R?


Solution

  • Try this, where DF is the frequency table from the question:

    transform(DF, Pct = 100 * ave(Frequency, City, FUN = prop.table))
    

    which gives:

            City  Open Frequency        Pct
    1 Wickenburg False         2   7.142857
    2 Wickenburg  True        26  92.857143
    3   Wittmann  True         2  50.000000
    4   Wittmann False         2  50.000000
    5  Youngtown  True         7 100.000000
    6       Yuma  True         1 100.000000
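
For anyone who wants to run this end to end, here is a minimal base R sketch. It assumes the frequency table is rebuilt by hand as a data frame named DF, with the values copied from the question and Open stored as a character column; both are assumptions for illustration, since the original data is not shown. ave() splits Frequency by City, applies prop.table() to each group (each count divided by its group's sum), and returns the results in the original row order. The round() call is an optional addition to get one decimal place.

```r
# Hypothetical reconstruction of the question's frequency table.
DF <- data.frame(
  City      = c("Wickenburg", "Wickenburg", "Wittmann", "Wittmann",
                "Youngtown", "Yuma"),
  Open      = c("False", "True", "True", "False", "True", "True"),
  Frequency = c(2, 26, 2, 2, 7, 1),
  stringsAsFactors = FALSE
)

# Share of each row within its city, as a percentage rounded to one decimal.
DF$Pct <- round(100 * ave(DF$Frequency, DF$City, FUN = prop.table), 1)

# Equivalent form without prop.table: divide by the per-city total.
pct_check <- round(100 * DF$Frequency / ave(DF$Frequency, DF$City, FUN = sum), 1)

print(DF)
```

Both forms rely only on functions that ship with base R, so nothing has to be installed.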