Tags: r, dataframe, ggplot2, plot, geom-bar

How to create a bar plot with one categorical variable in different years in ggplot2?


I have a very large data frame in which the first column holds a numeric id for each row. Each of the other columns holds a categorical variable for one year, which can take one of two values (in this example, A or B). Here's a simplified data frame as an example:

id  var2017  var2018  var2019
1     A        B         A
2     B        A         A
3     B        A         B
4     A        A         A
5     A        B         B

I'd like to create a bar plot that shows the count of each type (A and B) for each year, with the bars grouped by type. I am new to R, so I've only tried plotting the years separately, which works fine:

graph <- ggplot(data = example) +
        geom_bar(aes(x = var2017))

The problem is that I don't know how to put them all together. How can I create a single plot with the years on the x axis, the count on the y axis, and a bar for each type within each year? The id doesn't need to be in the output.


Solution

  • The way to plot multiple columns in ggplot is to first convert the data to long form, which can be done with tidyr::gather. Each value then sits in its own row, and the name of the column it came from is stored in a new "year" column. Map that column to one aesthetic (x here) and the value, now in "type", to another (fill). You don't need to compute the count yourself: geom_bar does it by counting the rows in each group.

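    library(tidyverse)  # loads tidyr (gather), ggplot2 and the %>% pipe

    ggplot(data = example %>%
             gather(year, type, -id)) +  # long form: one row per id/year pair
      geom_bar(aes(x = year, fill = type), position = "dodge")  # bars side by side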
    


    (Note: I changed the example so that the years have different counts. Otherwise it's harder to see whether it's working.)

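    # read.table skips everything after "#" on a line (default comment.char),
    # so the note inside the text below is not read as data.
    example <- read.table(
      header = TRUE,
      stringsAsFactors = FALSE,
      text = "id  var2017  var2018  var2019
               1       A        B         A
               2       B        A         A
               3       B        A         B
               4       B        A         A     # var2017 A changed to B
               5       A        B         B")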
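
In current tidyr releases, gather is marked as superseded by pivot_longer, so the same reshaping might be written as in the sketch below. This is not the code from the answer above, just one possible variant: it assumes tidyr 1.0.0 or later, rebuilds the modified example with data.frame so that it runs on its own, and uses the names example_long, year_counts and p purely for illustration. The names_prefix argument is optional and only strips "var" from the year labels. The count table is there so the bar heights can be checked against numbers.

```r
library(tidyr)    # pivot_longer(); assumes tidyr >= 1.0.0
library(dplyr)    # count()
library(ggplot2)

# Same values as the modified example above, rebuilt so the block is self-contained
example <- data.frame(
  id      = 1:5,
  var2017 = c("A", "B", "B", "B", "A"),
  var2018 = c("B", "A", "A", "A", "B"),
  var2019 = c("A", "A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

# Wide -> long: one row per id/year pair
example_long <- pivot_longer(
  example,
  cols         = -id,      # reshape every column except id
  names_to     = "year",   # old column names go here
  names_prefix = "var",    # "var2017" becomes "2017"
  values_to    = "type"    # cell values (A or B) go here
)

# Tabulate the counts that geom_bar() will draw, one row per year/type pair
year_counts <- count(example_long, year, type)
print(year_counts)

# Same plot as before, built from the long data
p <- ggplot(example_long) +
  geom_bar(aes(x = year, fill = type), position = "dodge")
print(p)
```

With this data, year_counts should show 2 A and 3 B for 2017, and 3 A and 2 B for both 2018 and 2019, which is what the dodged bars should display.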