r, dataframe, mean, categorical-data

Calculating the proportions of Yes or No responses in R


I am new to R and really trying to wrap my head around everything (I am even taking an online course, which so far has not helped at all).

What I started with is a large data frame containing 97 variables pertaining to compliance with regulations.

I have created multiple data frames, one for each geographic location (there is probably an easier way to do it).

In each of these data frames, there are 7 variables for which I would like to find the proportion of "Yes" and "No" responses.

I first tried:

    summary(urban$vio_bag)
   Length     Class      Mode 
      398 character character

However, this tells me nothing useful except that I have 398 responses.

So I put this into a table:

    urbanbag <- table(urban$vio_bag)

This at least provided me with the number of Yes and No responses:

      Var1 Freq
    1   No  365
    2  Yes   30

So I then converted to a data.frame:

    urbanbag <- as.data.frame(urbanbag)

Then viewed it:

    summary(urbanbag)

      Var1        Freq
     No :1   Min.   : 30.0
     Yes:1   1st Qu.:113.8
             Median :197.5
             Mean   :197.5
             3rd Qu.:281.2
             Max.   :365.0

And the output still did not help; it was actually even less useful. I am not building these matrices in R. The data is a table imported from Excel.

I am just so lost and frustrated, having spent days trying to figure out something that seems so elementary, and nothing I found by googling has worked.

Is there a way to actually do this?


Solution

  • We can use prop.table on the table of counts to get the proportion of each response

    v1 <- prop.table(table(urban$vio_bag))

    then use barplot to plot the proportions

    barplot(v1)
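Since the question mentions 7 variables and one data frame per location, the same idea can be extended as in the rough sketch below. It is only an illustration under assumptions: the data frame `compliance`, the `location` column, and the second variable `vio_sign` are made up here (only `vio_bag` appears in the question), and the values are toy data, not the asker's.

```r
# Toy stand-in for the imported Excel table. The names `compliance`,
# `location` and `vio_sign` are hypothetical; only `vio_bag` is from the question.
compliance <- data.frame(
  location = c("urban", "urban", "urban", "urban",
               "rural", "rural", "rural", "rural"),
  vio_bag  = c("No", "No", "Yes", NA, "No", "Yes", "Yes", "No"),
  vio_sign = c("Yes", "No", "No", "No", "No", "No", NA, "No"),
  stringsAsFactors = FALSE
)

# Columns to summarise (the real data would list all 7 here).
vio_cols <- c("vio_bag", "vio_sign")

# The mean of a logical vector is the share of TRUE values, so this gives
# the proportion of "Yes" among the non-missing answers in each column.
yes_share <- sapply(compliance[vio_cols],
                    function(x) mean(x == "Yes", na.rm = TRUE))

# Full No/Yes breakdown per column. Fixing the factor levels keeps both
# categories in the result even if one of them never occurs in a column.
props <- sapply(compliance[vio_cols], function(x) {
  prop.table(table(factor(x, levels = c("No", "Yes"))))
})

# Proportions within each location without splitting the data frame:
# margin = 1 makes every row (location) sum to 1.
by_loc <- prop.table(table(compliance$location, compliance$vio_bag),
                     margin = 1)

print(yes_share)
print(props)
print(by_loc)
```

Note that table drops missing values by default, which would explain why the counts in the question (365 + 30 = 395) are lower than the 398 rows reported by summary. With those counts, the share of "Yes" among the non-missing answers is 30/395, or about 0.076.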