Tags: r, plot, ggplot2, bar-chart, levels

How to make ggplot2 keep unused levels on data subset


My problem is clearly not new, but I haven't been able to find an answer to my exact coding question. I am working from a subset of my data (available here) and have tried every combination of scale_x_discrete(drop=FALSE) and scale_fill_discrete(drop=FALSE) I can think of to get ggplot2 to leave a space where the bar for Chipmunks would be (n=0 for event "CF"; n.b. "event" corresponds to the variable "forage" in the data).

The code I am using is as follows:

library(ggplot2)
library(ggthemes)

# Excluding MICROs from my plot
ggplot(data[data$sps=="MAMO" | data$sps=="TAST" | data$sps=="MUVI" | data$sps=="MUXX" | data$sps=="TAHU",],
       aes(sps, fill=forage)) + geom_bar(position="dodge") +
    labs(x = "Species", y = "Number of observations") +
    scale_x_discrete(labels = c("Marmot","American Mink", "Weasel Spp.", "Red squirrel", "Chipmunk")) +
    theme_classic() + 
    scale_fill_manual(values = c("#000000", "#666666", "#999999","#CCCCCC"), name = "Event")

I then get a plot like this one: Plot

When I add scale_x_discrete(drop = FALSE) I get this: Problem Plot. What the code appears to be doing is including my previously excluded MICRO data (hence everything after Marmots gets shifted over by one, and Chipmunks still only have 3 bars).

When I try scale_fill_discrete(drop = FALSE), the resulting plot is identical to the first plot above. When I use both scale_x_discrete(drop = FALSE) and scale_fill_discrete(drop = FALSE), the plot looks like the second plot above.

I figure I can manually go and make a small table with the frequencies for each level (Event), but I would like to first try to code it properly in R.

Does anyone have any suggestions for what I could add/change in my code to do this?

Update: I tried the code suggested below:

df1 %>% 
  filter(sps != "MICRO") %>% 
  group_by(sps) %>% 
  count(forage) %>% 
  ungroup %>% 
  complete(sps, forage, fill = list(n = 0)) %>% 
  ggplot(aes(sps, n)) + geom_col(aes(fill = forage), position = "dodge") +
  scale_x_discrete(labels=c("Marmot","American Mink", "Weasel Spp.", "Red squirrel", "Chipmunk")) + 
  theme_classic() + 
  scale_fill_manual(values=c("#000000", "#666666", "#999999","#CCCCCC"), name = "Event") + 
  labs(x = "Species", y = "Number of observations")

The resulting plot has the space (yay!) but still has an empty space for where MICRO would be:

second attempt


Solution

  • The issue is that no zero count is generated for the combination sps = TAST, forage = CF, so there is nothing for ggplot2 to leave a space for. You can create that count with tidyr::complete. I've also used some dplyr functions to make the code cleaner. This assumes your data frame is named df1 (rather than data, which is already the name of a built-in R function, so it is not a good choice):

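    UPDATED: the file is now read with stringsAsFactors = FALSE to address the issue raised in the comments. With sps read as a character column, filtering out MICRO leaves no unused factor level behind, so complete() should no longer create an empty MICRO slot.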

    library(dplyr)
    library(tidyr)
    library(ggplot2)
    
    # Read sps and forage as character rather than factor
    df1 <- read.table("data.txt", header = TRUE, stringsAsFactors = FALSE)
    df1 %>%
      filter(sps != "MICRO") %>%
      group_by(sps) %>%
      count(forage) %>%
      ungroup() %>%
      # Add the missing sps/forage combinations with a count of zero
      complete(sps, forage, fill = list(n = 0)) %>%
      ggplot(aes(sps, n)) + geom_col(aes(fill = forage), position = "dodge") +
        scale_x_discrete(labels=c("Marmot","American Mink", "Weasel Spp.", "Red squirrel", "Chipmunk")) +
        theme_classic() +
        scale_fill_manual(values=c("#000000", "#666666", "#999999","#CCCCCC"), name = "Event") +
        labs(x = "Species", y = "Number of observations")
    

    Result: [plot image]
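
To see why the factor-versus-character distinction matters, here is a minimal sketch on made-up data. The data frame `toy` and its counts are hypothetical; only the species codes and the `sps`/`forage` column names come from the question. It assumes `sps` is stored as a factor, which is what read.table() produced by default before R 4.0. It relies on the documented behaviour that complete() expands a factor over all of its levels, not just the ones still present in the data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: codes follow the question, the rows are invented.
# sps is deliberately a factor so that it carries a MICRO level.
toy <- data.frame(
  sps    = factor(c("MAMO", "MAMO", "TAST", "TAST", "MICRO")),
  forage = c("CF", "N", "N", "N", "CF")
)

# Filtering removes the MICRO rows but not the MICRO factor level.
counts <- toy %>%
  filter(sps != "MICRO") %>%
  count(sps, forage)
levels(counts$sps)

# complete() expands over every factor level, so MICRO comes back with n = 0.
with_micro <- counts %>%
  complete(sps, forage, fill = list(n = 0))

# Dropping unused levels first limits the grid to the species still present.
without_micro <- counts %>%
  mutate(sps = droplevels(sps)) %>%
  complete(sps, forage, fill = list(n = 0))
```

In this sketch `with_micro` has six rows (three species levels by two forage values, including two zero-count MICRO rows), while `without_micro` has four, with TAST/CF filled in as zero. So if you prefer to keep `sps` as a factor, calling droplevels() before complete() should have the same effect as reading the file with stringsAsFactors = FALSE.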