
How to display 2 sets of stacked data in one barplot


I'm creating an MI pack and want to include a barplot that shows when something was submitted and who was at fault for lateness. I have 5 columns of data, which can be split into two sections:

Section 1 (submission time) = Pre 7am, 7am-9am, Post 9am

Section 2 (fault) = us, them

There are 6 rows of data (i.e., the previous 6 months).

I have no problems reading the data, manipulating it, changing the dates or formatting. Likewise, I have no issues creating barplots of either the whole data set stacked or each section stacked on its own (which is what I want).

What I would like is for section 1 to be a stacked bar, with section 2 as a second stacked bar next to it.

I recreated this in Excel in case the description is not clear.

This is the code I have so far. First, reading the data and manipulating it into a suitable format:

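# Packages assumed from the functions used here and in the later snippets
library(openxlsx)  # read.xlsx
library(janitor)   # excel_numeric_to_date
library(dplyr)     # %>% and select

MIDF <- read.xlsx("C:/MISMM.xlsx", sheet = 11, startRow = 3, colNames = TRUE)
MDATE <- excel_numeric_to_date(MIDF$`Month./.Year`, date_system = "modern", include_time = FALSE)
MDATE <- format(as.Date(MDATE), "%m-%Y")
MIDF <- cbind(MDATE, MIDF[2:19])
MIDF <- tail(MIDF, 6)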

This is all 5 columns stacked:

LAFU <- MIDF %>% select (9, 10, 11, 12, 13)
LAFUa <- MIDF %>% select(1)
LAFUa <- (t(LAFUa))
LAFU <- (t(LAFU))
LAFUCHART <- barplot(LAFU, 
         names.arg=LAFUa, 
         main = "Lates and who is at fault",
         col=c("#1488CA", "#6B7E87", "#AA0B3C", "#FDC41F",  "#85C9F0"), 
         legend = rownames(LAFU), 
         beside = FALSE)

And this is the two sections plotted separately:

LAFUresub <- MIDF %>% select (9, 10, 11)
LAFUerror <- MIDF %>% select (12, 13)
LAFUa <- MIDF %>% select(1)
LAFUa <- (t(LAFUa))
LAFUresub <- (t(LAFUresub))
LAFUerror <- (t(LAFUerror))

LAFUresubBP <- barplot(LAFUresub,
         names.arg = LAFUa,
         main = "Lates",
         col = c("#1488CA", "#6B7E87", "#AA0B3C"),
         legend = rownames(LAFUresub),
         beside = FALSE)

LAFUerrorBP <- barplot(LAFUerror,
         names.arg = LAFUa,
         main = "Who is at fault",
         col = c("#FDC41F", "#85C9F0"),
         legend = rownames(LAFUerror),
         beside = FALSE)

And finally, my attempt at replicating the picture. It's a bit lame, but I honestly have no idea where to start.

LAFUTIME <- t(group_by(MIDF %>% select(9,10,11)))
LAFUERROR <- t(group_by(MIDF %>% select(12, 13)))
LAFUDATE <- t(MIDF %>% select(1))
test<-rbind(LAFUTIME, LAFUERROR)
barplot(LAFUTIME %>% LAFUERROR)

Any help greatly appreciated. David

Here is some code for a dataframe that is comparable to what I am using, if that will help.

pre7am <- c(1,2,3,2,1,3)
SamNam<- c(2,4,3,6,5,3)
post9am<- c(1,2,1,0,1,0)
us <-     c(0,0,1,3,2,0)
them <-   c(4,8,6,5,5,6)
dates <- c("Jul18", "Aug18", "Sept18", "Oct18", "Nov18", "Dec18")

DF <- data.frame(pre7am, SamNam, post9am, us, them, row.names = dates)

Solution

  • So I used the following, which is pretty close:

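    library(tidyverse)

    DF %>%
      # Move the row names into a column, as a factor in chronological order
      # (levels must match the row names exactly, hence "Sept18")
      mutate(Dates = rownames(DF),
             Dates = factor(Dates, levels = c("Jul18", "Aug18", "Sept18", "Oct18", "Nov18", "Dec18"))) %>%
      # Reshape to long format: one row per date and variable
      gather(vartype, value, -Dates) %>%
      # "Subsection 1" if the variable name contains "am", "Subsection 2" otherwise
      mutate(Subsection = paste("Subsection", abs(as.numeric(grepl("am", vartype)) - 1) + 1)) %>%
      ggplot(aes(Subsection, value, fill = vartype, col = vartype)) +
      geom_col() +
      facet_grid(. ~ Dates)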
    

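    First we move the dates from the row names into an actual variable, Dates. Then we make Dates a factor with its levels in chronological order, so that facet_grid lays the panels out by month rather than alphabetically. The levels have to match the row names exactly; any date that is not among the levels becomes NA.

    Next we reshape the data into long format using gather. Subsection is then derived from the variable name: grepl("am", vartype) is TRUE for the three submission-time columns and FALSE for us and them, and the arithmetic around it maps TRUE to 1 and FALSE to 2, giving "Subsection 1" and "Subsection 2". This relies on the column names, so it only works while the time columns contain "am" and the fault columns do not.

    Since value already holds the bar heights, we use geom_col, which stacks the variables within each Subsection. Finally, facet_grid splits the plot into one panel per date, so each month shows the two stacks side by side.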
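
For something closer to the barplot() calls in the question, the same two-stacks-per-month idea can also be sketched in base graphics. This is only a rough sketch, not part of the answer above: it assumes the example DF from the question, and the helper names (time_cols, fault_cols, both, gaps, mids) are made up for illustration. The trick is to give each month two columns, one holding only the submission-time values and one holding only the fault values, and to use the space argument of barplot to keep each pair close together.

```r
# Example data from the question
DF <- data.frame(
  pre7am  = c(1, 2, 3, 2, 1, 3),
  SamNam  = c(2, 4, 3, 6, 5, 3),
  post9am = c(1, 2, 1, 0, 1, 0),
  us      = c(0, 0, 1, 3, 2, 0),
  them    = c(4, 8, 6, 5, 5, 6),
  row.names = c("Jul18", "Aug18", "Sept18", "Oct18", "Nov18", "Dec18")
)

# Hypothetical helper names: which columns belong to which stack
time_cols  <- c("pre7am", "SamNam", "post9am")
fault_cols <- c("us", "them")

m <- t(as.matrix(DF))   # rows = categories, columns = months
n <- ncol(m)

# One copy with the fault rows zeroed, one with the time rows zeroed
time_m  <- m; time_m[fault_cols, ] <- 0
fault_m <- m; fault_m[time_cols, ] <- 0

# Interleave the columns: time bar, then fault bar, for each month
both <- matrix(0, nrow = nrow(m), ncol = 2 * n,
               dimnames = list(rownames(m), NULL))
both[, seq(1, 2 * n, by = 2)] <- time_m
both[, seq(2, 2 * n, by = 2)] <- fault_m

# Wide gap before each month's first bar, narrow gap within the pair
gaps <- rep(c(1, 0.1), times = n)

mids <- barplot(both,
                space = gaps,
                col = c("#1488CA", "#6B7E87", "#AA0B3C", "#FDC41F", "#85C9F0"),
                legend.text = rownames(both),
                main = "Lates and who is at fault")

# Put one month label under the middle of each pair of bars
axis(1,
     at = (mids[seq(1, 2 * n, by = 2)] + mids[seq(2, 2 * n, by = 2)]) / 2,
     labels = rownames(DF),
     tick = FALSE)
```

If the legend ends up covering the bars, it can be moved with the args.legend argument of barplot.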