r  ggplot2  geom-bar  factors

How to generate sub-plots with grouped categorical variable sorted by a numeric variable in ggplot?


I have a data frame text holding the count n of each word in each file, where file_num is 1, 2, or 3. I would like to use ggplot to generate three sub-plots, one for each value of file_num, with word on the y-axis and the frequency n on the x-axis. Within each sub-plot, the words should be sorted by increasing or decreasing n as observed within that file_num. I have tried many different ways to solve this seemingly trivial issue but have not been successful.

Here is dput of my test data:

structure(list(file_num = c("1", "1", "1", "1", "2", "2", "2", 
"2", "2", "3", "3", "3", "3", "3"), word = c("test", "quality", 
"page", "limit", "information", "limit", "test", "instruments", 
"quality", "limit", "test", "effective", "page", "system"), n = c(5, 
35, 55, 75, 20, 30, 40, 60, 70, 101, 201, 301, 401, 501)), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -14L), spec = structure(list(
    cols = list(file_num = structure(list(), class = c("collector_character", 
    "collector")), word = structure(list(), class = c("collector_character", 
    "collector")), n = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1), class = "col_spec"))

Here is what I have tried:

library(tidytext)
library(stringr)
library(pdftools)
library(dplyr)
library(purrr)
library(ggplot2)
library(forcats)
text %>% group_by(file_num) %>% arrange(file_num, desc(n)) %>%
    ggplot(.,aes(factor(word,levels = unique(word)), n, fill = file_num)) + 
    geom_bar(stat = "identity", position = "dodge") +
    scale_x_discrete("Word") +
    scale_y_continuous("n")  + coord_flip() +
    facet_grid(rows = vars(file_num), scales = "free")

Here is the plot generated by the above code on the data frame text created from the dput data. It shows the desired result (words sorted by increasing n) for file_num = 1, but not for file_num = 2 or 3:

[Image: plot produced by the code above]


Solution

  • You could achieve this "ordered per facet" quite simply with the ggcharts package, using the following code on your data:

    library(ggcharts)
    # Store the plot in chart so that it can be inspected and modified later
    chart <- bar_chart(data = text, x = word, y = n, 
      fill = file_num,
      facet = file_num,
      horizontal = TRUE
    )
    chart
    

    This yields the following graph:

    [Image: horizontal bar chart with one facet per file_num, bars sorted by n within each facet]

    Please, let me know whether this is what you want.

    Update:

    The object returned by bar_chart (stored here in a variable named chart, e.g. chart <- bar_chart(...)) is of class ggplot, as can be seen below:

    class(chart)
    [1] "gg"     "ggplot"
    

    This means that one can use the ggplot2 functions to alter the graph, e.g.:

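    chart + 
      guides(fill = "none") +   ## remove legend (fill = FALSE is deprecated)
      ggtitle("My new title") + ## add title
      theme_linedraw() +
      ## linewidth replaces size for rectangle outlines in ggplot2 >= 3.4.0
      theme(strip.background = element_rect(colour = "red", linewidth = 2))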
    

    yielding the following picture (for illustration only):

    [Image: the same chart with the legend removed, a new title, and red-outlined facet strips]
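
For readers who prefer to stay within ggplot2, here is an alternative that is not part of the answer above. The attempt in the question fails because factor(word, levels = unique(word)) defines a single level order shared by all facets, so a word that occurs in several files (such as "limit" or "test") can hold only one position. A minimal sketch of one common workaround, assuming the tidytext package (already loaded in the question) and ggplot2 >= 3.3.0 for horizontal bars without coord_flip; the names text_ordered, word_in_file, and p are made up for this example:

```r
library(dplyr)
library(ggplot2)
library(tidytext)  # provides reorder_within() and scale_y_reordered()

# Same test data as in the question, rebuilt as a plain data frame
text <- data.frame(
  file_num = c("1", "1", "1", "1", "2", "2", "2", "2", "2",
               "3", "3", "3", "3", "3"),
  word = c("test", "quality", "page", "limit", "information", "limit", "test",
           "instruments", "quality", "limit", "test", "effective", "page",
           "system"),
  n = c(5, 35, 55, 75, 20, 30, 40, 60, 70, 101, 201, 301, 401, 501)
)

# reorder_within() pastes word and file_num together and orders the resulting
# levels by n, so every (word, file_num) pair gets its own position
text_ordered <- text %>%
  mutate(word_in_file = reorder_within(word, n, file_num))

p <- ggplot(text_ordered, aes(x = n, y = word_in_file, fill = file_num)) +
  geom_col() +
  scale_y_reordered() +  # removes the internal file_num suffix from the labels
  labs(x = "n", y = "Word") +
  facet_grid(rows = vars(file_num), scales = "free_y")

p
```

The scales = "free_y" argument matters here: without it every facet would list all 14 word-and-file combinations instead of only its own words.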