r visualization boxplot binning

How do I draw multiple boxplots of a single variable, grouped by bins from the cut function, in R?


I am trying to create multiple boxplots of a single variable, one box per bin, where the bins come from the cut function:

movie_reg %>% select(Collection) %>% pull() %>% cut(7) 

  [1] (3.57e+04,4.86e+04] (3.57e+04,4.86e+04] (6.14e+04,7.43e+04] (6.14e+04,7.43e+04] (6.14e+04,7.43e+04]
  [6] (4.86e+04,6.14e+04] (3.57e+04,4.86e+04] (3.57e+04,4.86e+04] (2.29e+04,3.57e+04] (3.57e+04,4.86e+04]
 [11] (2.29e+04,3.57e+04] (3.57e+04,4.86e+04] (3.57e+04,4.86e+04] (3.57e+04,4.86e+04] (3.57e+04,4.86e+04]
.
.
[501] (2.29e+04,3.57e+04] (3.57e+04,4.86e+04] (3.57e+04,4.86e+04] (3.57e+04,4.86e+04] (3.57e+04,4.86e+04]
[506] (3.57e+04,4.86e+04]
7 Levels: (9.91e+03,2.29e+04] (2.29e+04,3.57e+04] (3.57e+04,4.86e+04] (4.86e+04,6.14e+04] ... (8.71e+04,1e+05]

I am not sure how to pass the levels and their corresponding values to boxplot. Below is what I have tried, but it gives an error:

movie_reg %>% select(Collection) %>% pull() %>% cut(7) %>% boxplot(aes(x=levels))
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 'x' must be atomic

Solution

  • I think you mixed up boxplot() from base R with geom_boxplot() from ggplot2: aes() is a ggplot2 function, and base boxplot() does not know what to do with it. In any case, if your question is about visualizing the categories obtained from cut(), you can add them as a column using:

    library(dplyr)
    # example data standing in for the real movie_reg
    movie_reg <- data.frame(Collection = runif(100))
    movie_reg <- movie_reg %>% mutate(levels = cut(Collection, 7))
    

    Then use base R boxplot() with a formula:

    boxplot(Collection ~ levels, data = movie_reg %>% mutate(levels = cut(Collection, 7)), horizontal = TRUE, las = 2, cex.axis = 0.6)
    


    Or with ggplot2:

    library(ggplot2)
    movie_reg %>% mutate(levels = cut(Collection, 7)) %>% ggplot(aes(x = levels, y = Collection)) + geom_boxplot() + coord_flip()
    

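The bin labels in the question come out in scientific notation, which is hard to read on an axis. As a minimal base R sketch, assuming example data in roughly the same range as the question (the real movie_reg is not shown, so the values here are made up), the dig.lab argument of cut() asks for more significant digits in the labels, and table() shows how many observations fall in each bin before plotting:

```r
# Example data only: the real movie_reg is not shown in the question.
set.seed(1)
movie_reg <- data.frame(Collection = runif(100, min = 1e4, max = 1e5))

# dig.lab raises the number of significant digits used in the interval labels.
movie_reg$levels <- cut(movie_reg$Collection, breaks = 7, dig.lab = 6)

# Number of observations per bin; very sparse bins give uninformative boxes.
bin_counts <- table(movie_reg$levels)
print(bin_counts)

# Same base R plot as above, now using the stored column.
boxplot(Collection ~ levels, data = movie_reg,
        horizontal = TRUE, las = 2, cex.axis = 0.6)
```

With values of this size, the labels should typically print as plain numbers rather than in the 3.57e+04 style. Checking the counts first is optional, but it helps when a bin holds only one or two points.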