r  ggplot2  plot  missing-data  naniar

Plot missing values by group and time


I have a dataset that looks something like this:

df <- data.frame(
  "id"   = c("Alpha", "Alpha", "Alpha", "Alpha", "Beta", "Beta", "Beta", "Beta", "Gamma", "Gamma", "Gamma", "Gamma"),
  "Year" = c(1970, 1971, 1972, 1973, 1970, 1971, 1972, 1973, 1970, 1971, 1972, 1973),
  "Val"  = c(2, NA, NA, 5, NA, 5, NA, 5, 1, 3, 4, NA)
)

I would like to show the panel structure of my data. Ideally, I would like to create a plot that shows, for every subject, which values are missing, ordered by year. The plot would have "Year" on the x-axis, "id" on the y-axis, and in the middle, rectangles of different colors (e.g. grey = missing, blue = not missing).

I have tried library(VIM) matrixplot() and library(naniar) gg_miss_fct(), which produce visuals similar to the one I am looking for. However: 1) I only need to produce the plot for one variable and not the whole dataset (while gg_miss_fct and matrixplot plot missing values for all the variables), and 2) I would like the missing values to appear ordered by time.

I thank you in advance for your help


Solution

    library(ggplot2)

    # One tile per id/Year cell, filled by whether Val is missing
    ggplot(df, aes(Year, id, fill = is.na(Val))) +
        geom_tile(col = "black") +
        coord_equal()
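
The plot in the answer uses ggplot2's default fill colours and a TRUE/FALSE legend. To get the grey = missing, blue = not missing scheme described in the question, one possible variation is sketched below. The `Status` column, its labels, and the specific colour names are illustrative choices and not part of the original answer.

```r
library(ggplot2)

# Same data as in the question, written more compactly
df <- data.frame(
  id   = rep(c("Alpha", "Beta", "Gamma"), each = 4),
  Year = rep(1970:1973, times = 3),
  Val  = c(2, NA, NA, 5, NA, 5, NA, 5, 1, 3, 4, NA)
)

# Readable labels for the legend (column name `Status` is an assumption)
df$Status <- ifelse(is.na(df$Val), "Missing", "Observed")

p <- ggplot(df, aes(x = Year, y = id, fill = Status)) +
  geom_tile(colour = "black") +
  # Map each label to a fixed colour: grey for missing, blue for observed
  scale_fill_manual(values = c(Missing = "grey70", Observed = "steelblue")) +
  # Show every year as its own tick so the time order is explicit
  scale_x_continuous(breaks = sort(unique(df$Year))) +
  coord_equal() +
  labs(fill = NULL)

print(p)
```

Because only `Val` is mapped to the fill, other columns in the data frame are ignored, and the numeric x-axis keeps the tiles in year order. Note that this only marks explicit NA values: an id/Year combination with no row at all would show as an empty gap, not a grey tile.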