I am trying to show the number of NAs observed in a bar chart, as simple text; see the example below.
I keep going back and forth between two approaches: filtering out all NAs before the ggplot call, which forces me to rely on annotate("text", ...) to add an extra layer (dissociated from the original plot data), or keeping the NAs in the plot data, filtering them out in the geom_bar call, and adding a geom_text layer. I'm not having much luck with either.
The code for a minimal working example:
library(ggplot2)
df <- data.frame(y = factor(sample(letters[24:26], 50, replace = TRUE)), x = factor(sample(c(LETTERS[1:3], NA), 50, replace = TRUE)))
# make sure one level of y ("x") contains no NAs
df[df$y == "x" & is.na(df$x), "x"] <- "A"
ggplot(df, aes(y = y, fill = x)) + geom_bar(position = "fill")
The df[...] <- "A" assignment is there only to ensure that one level of the grouping variable (y) has no NAs.
The output:
The desired output:
Alternatively, you can calculate the number of NAs per group in a new column and use that column to display the count at the end of each bar, like this:
set.seed(7)
df <- data.frame(y = factor(sample(letters[24:26], 50, replace=TRUE)), x = factor(sample(c(LETTERS[1:3], NA), 50, replace=TRUE)))
df[df$y=="x" & is.na(df$x),'x'] <- "A"
library(ggplot2)
library(dplyr)
library(tidyr)
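df %>%
  group_by(y) %>%
  # count the NAs in x within each level of y
  mutate(total_na = sum(is.na(x))) %>%
  # drop the NA rows so they are not drawn as a fill category
  drop_na(x) %>%
  ggplot(aes(y = y, fill = x)) +
  geom_bar(position = "fill") +
  # x = 1 is the right-hand end of each filled bar; hjust = 0 starts the text there
  geom_text(aes(x = 1, label = paste0("NA: ", total_na)), hjust = 0, size = 3) +
  # allow the labels to be drawn outside the panel area
  coord_cartesian(clip = "off")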
Created on 2023-09-27 with reprex v2.0.2
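One thing to be aware of with the mutate() approach: total_na is repeated on every remaining row, so geom_text draws one identical label per row, stacked on top of each other. A possible variant, sketched here as an untested-on-your-data assumption rather than a definitive fix, is to summarise the NA counts into a separate one-row-per-group data frame and pass that to geom_text. The names na_counts and p are only illustrative, and inherit.aes = FALSE is used because the summary table has no x column for the inherited fill mapping.

```r
library(ggplot2)
library(dplyr)

set.seed(7)
df <- data.frame(
  y = factor(sample(letters[24:26], 50, replace = TRUE)),
  x = factor(sample(c(LETTERS[1:3], NA), 50, replace = TRUE))
)
df[df$y == "x" & is.na(df$x), "x"] <- "A"

# One row per level of y, holding the NA count for that level
# (na_counts is an illustrative name, not part of the original answer)
na_counts <- df %>%
  group_by(y) %>%
  summarise(total_na = sum(is.na(x)), .groups = "drop")

# Bars use only the non-NA rows; labels come from the summary table
p <- ggplot(filter(df, !is.na(x)), aes(y = y, fill = x)) +
  geom_bar(position = "fill") +
  geom_text(
    data = na_counts,
    aes(x = 1, y = y, label = paste0("NA: ", total_na)),
    inherit.aes = FALSE, # the summary table has no x column to map to fill
    hjust = 0, size = 3
  ) +
  coord_cartesian(clip = "off")

p
```

Because the label layer has its own data, a level of y that contains only NAs would still get a label even though it has no bar.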