Tags: r, ggplot2, dplyr, geom-bar

Percentages of a variable in another variable using dplyr and creating a boxplot with standard deviation


I have this df. I want to turn glasses into a factor with two levels, <=1.5 and >1.5. Then I want to find what percentage of the observations in each level has a ciss value above 16. Each level is treated as its own group, so each group counts as 100%.

glasses <- c(1.0,1.1,1.1,1.6,1.2,1.7,2.2,5.2,8.2,2.5,3.0,3.3,3.0,3.0)
ciss <- c(2,9,10,54,65,11,70,54,0,65,8,60,47,2)
df <- cbind(glasses, ciss)
df

I want an outcome that looks like this:

glasses    Percentages ciss > 16
<=1.5      xx%
>1.5       xx%

I tried using dplyr:

dfnew <- df %>% mutate(ani=cut(glasses, breaks=c(-Inf, 1.5, Inf), 
                         labels=c("<=1.5",">1.5")))
dfnew %>% group_by(ani) %>% mutate(perc = ciss>16 / sum(ciss))

Lastly, I would like to show the percentages in a boxplot (glasses on the x axis, percentage of ciss values above 16 on the y axis).


Solution

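  • Try this. It computes the share of ciss values above 16 within each glasses group and plots it as a bar chart.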

    library(tidyverse)  # attaches dplyr and ggplot2
    
    #Input data
    glasses = c(1.0,1.1,1.1,1.6,1.2,1.7,2.2,5.2,8.2,2.5,3.0,3.3,3.0,3.0)
    ciss = c(2,9,10,54,65,11,70,54,0,65,8,60,47,2)
    
    #Bind in dataframe
    df = as.data.frame(cbind(glasses,ciss))
    
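    # Share of rows with ciss > 16 within each glasses group.
    # Filtering on ciss > 16 before grouping would instead give each group's
    # share of all above-16 rows, which is not what the question asks for.
    df %>%
       mutate(typglass = if_else(glasses > 1.5, ">1.5", "<=1.5")) %>%
       group_by(typglass) %>%
       summarise(freq = mean(ciss > 16)) %>%
       ggplot() +
       geom_bar(aes(x = typglass, y = freq, fill = typglass), stat = "identity", width = 0.5) +
       theme_classic()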
    

    This gives a bar chart with one bar per glasses group.
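
If you also want the table from the question rather than only the plot, here is a minimal sketch. It uses `cut()` as in the question's own attempt; the names `group`, `pct_table` and `pct_above_16` are just illustrative choices, not something from the original post. Note that the attempt in the question does not compute a percentage, because `ciss>16 / sum(ciss)` is parsed as `ciss > (16 / sum(ciss))`: in R, `/` binds more tightly than `>`. Taking the mean of the logical vector `ciss > 16` within each group avoids that.

```r
library(dplyr)

# Data from the question, built as a data frame (cbind() alone gives a matrix)
glasses <- c(1.0,1.1,1.1,1.6,1.2,1.7,2.2,5.2,8.2,2.5,3.0,3.3,3.0,3.0)
ciss <- c(2,9,10,54,65,11,70,54,0,65,8,60,47,2)
df <- data.frame(glasses, ciss)

pct_table <- df %>%
  # Two-level factor: (-Inf, 1.5] and (1.5, Inf)
  mutate(group = cut(glasses, breaks = c(-Inf, 1.5, Inf),
                     labels = c("<=1.5", ">1.5"))) %>%
  group_by(group) %>%
  # mean() of a logical vector is the proportion of TRUE values,
  # so each group is its own 100%
  summarise(n = n(),
            pct_above_16 = 100 * mean(ciss > 16))

print(pct_table)
```

With the data above this should give 25% for the <=1.5 group (1 of 4 rows) and 60% for the >1.5 group (6 of 10 rows).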