
control charts plotting all dates in R


I am having problems with qic charts (control charts): the x-axis does not show all the dates I want. I rounded the dates to 14-day intervals, and the period covers 59 weeks. I want every one of these dates labelled on the axis, but I could not get that to work and could not find anything about it online. I am new to control charts.

Here is an example. It is not the original data, so it covers fewer weeks, but that does not matter as long as all dates are plotted.

Reproducing the data:

df <- data.frame(
  x = rep(1:24, 4),
  ReportMonth = rep(seq(as.Date("2014-01-01"), length.out = 24, by = "month"), 4),
  num = rbinom(4 * 24, 100, 0.5),
  denom = round(runif(4 * 24, 90, 110)),
  grp1 = rep(c("g", "h"), each = 48),
  grp2 = rep(c("A", "B"), each = 24)
)
df

And plotting

library(qicharts2)

qic(x = ReportMonth,
    y = num,
    n = denom,
    data = df,
    chart = "i",
    x.format = "%Y-%m-%d",
    x.angle = 90,
    y.expand = 40, # extend the y axis to include this value
    xlab = "Month",
    ylab = "Value")

I have also tried with ggplot2, but have not succeeded.

library(ggplot2)
library(plyr)

# p3 is not defined in this post; it needs columns x, y, cl, lcl and ucl
p3.1 <- rename(p3, c("x" = "Date"))
# after the rename the column is called Date, not x
p3.1$Date <- as.Date(p3.1$Date)

plot4 <- ggplot(p3.1, aes(x = Date, y = y)) +
  geom_ribbon(aes(ymin = lcl, ymax = ucl), alpha = 0.4) +   # fill = ""
  geom_line(colour = "blue", size = .75) +
  geom_line(aes(Date, cl)) +
  geom_point(colour = "red", fill = "red", size = 1.5) +
  #x.axis(1, p3$x, format(p3$x, "%Y-%m-%d"), cex.axis = 0.7)+
  ggtitle(label = "Readmissions within 30 days") +
  labs(x = NULL, y = NULL) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

#aes(x = format(ActiveDate,"%Y-%m"), group = 1)) + geom_line(stat = "count") 
#+ theme(axis.text.x = element_text(angle = 90, hjust = 1)) 

plot4

Solution

  • You have two possible values to plot: num and denom. For simplicity's sake I'm going to calculate num as a proportion of denom and plot that as pct. (But you can certainly choose to plot num or denom instead.)

    Also, based on your data frame, df, you have four groups of values:

    • Group 1: grp1 = g, grp2 = A
    • Group 2: grp1 = h, grp2 = A
    • Group 3: grp1 = g, grp2 = B
    • Group 4: grp1 = h, grp2 = B

    Part of the problem is that each group needs to be plotted separately, but you do not pass the groups to qicharts2::qic() or ggplot2::ggplot(). To do so, first combine grp1 and grp2 into a single group column (grp).

    library(tidyverse)
    library(qicharts2)
    
    df_2 <- 
      df %>% 
      # calculate num as a proportion of denom
      mutate(pct = round(num / denom, digits = 2)) %>%
      # combine grp1 and grp2 into a single grp column
      unite("grp", grp1, grp2)
    
    head(df_2)
      x ReportMonth num denom grp  pct
    1 1  2014-01-01  46   100 g_A 0.46
    2 2  2014-02-01  54   105 g_A 0.51
    3 3  2014-03-01  49   100 g_A 0.49
    4 4  2014-04-01  56    94 g_A 0.60
    5 5  2014-05-01  54   102 g_A 0.53
    6 6  2014-06-01  48   106 g_A 0.45
    

    It is perfectly fine to plot multiple groups on a line chart (time series).

    ggplot(df_2, aes(x = ReportMonth, y = pct, color = grp)) +
      geom_line() +
      scale_x_date(date_breaks = "2 months", date_labels = "%b '%y") +
      scale_y_continuous(labels = scales::percent) +
      theme_minimal()
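
    If the goal is a tick label for every date rather than every second month, one option is to pass the dates themselves as breaks. The following is a minimal sketch, not part of the original answer: `one_group` and `all_dates` are illustrative names, and the data are made up to stand in for a single group of df_2.

```r
library(ggplot2)

# Illustrative data standing in for one group of df_2 (values are made up)
set.seed(42)
one_group <- data.frame(
  ReportMonth = seq(as.Date("2014-01-01"), by = "month", length.out = 24),
  pct = round(rbinom(24, 100, 0.5) / 100, 2)
)

# One break per observed date, so every date gets its own tick label
all_dates <- sort(unique(one_group$ReportMonth))

p_all_dates <- ggplot(one_group, aes(x = ReportMonth, y = pct)) +
  geom_line() +
  geom_point() +
  scale_x_date(breaks = all_dates, date_labels = "%Y-%m-%d") +
  theme_minimal() +
  # rotate the labels so 24 dates fit without overlapping
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))

p_all_dates
```

    For dates spaced 14 days apart, a vector built with `seq(min(d), max(d), by = "14 days")` (where `d` is the date column) could be passed to `breaks` in the same way.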
    


    But you should not plot multiple groups on a single control chart. The control limits on a control chart are based on the historic values of a single series (group). If you plot all four groups on the same control chart, you get four sets of control limits, which makes the chart very confusing and almost impossible to interpret.

    Instead you should plot four control charts, one for each group.

    df_2 %>% 
      # split into a list of data frames, one per group
      split(.$grp) %>% 
      # apply qic to each group
      purrr::map(~ qicharts2::qic(
        ReportMonth, pct, 
        data = ., 
        chart = "i", # choose an appropriate control chart
        title = paste("Group:", unique(.$grp)),
        xlab = "ReportMonth",
        ylab = "pct"
        ))
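
    To illustrate why the limits belong to one series only, here is a rough base R sketch of the textbook individuals (I) chart calculation: centre line at the mean, limits at the mean plus or minus 2.66 times the average moving range. `i_chart_limits` is a hypothetical helper written for this illustration, and `y_b` is an invented second group; qic() may apply further refinements, so its limits need not match these exactly.

```r
# Textbook I chart limits for a single series:
#   centre line = mean(y)
#   limits      = mean(y) +/- 2.66 * mean(|y[i] - y[i - 1]|)
i_chart_limits <- function(y) {
  mr <- abs(diff(y))              # moving ranges between consecutive points
  cl <- mean(y)                   # centre line
  half_width <- 2.66 * mean(mr)   # distance from centre line to each limit
  c(lcl = cl - half_width, cl = cl, ucl = cl + half_width)
}

y_a <- c(0.46, 0.51, 0.49, 0.60, 0.53, 0.45)  # pct values from head(df_2) above
y_b <- y_a + 0.20                             # hypothetical second group, shifted up

lim_a <- i_chart_limits(y_a)
lim_b <- i_chart_limits(y_b)

# Each group gets its own centre line and limits
rbind(group_a = lim_a, group_b = lim_b)
```

    Because every group has its own mean and its own moving ranges, each one ends up with a different set of limits, which is why one chart per group is easier to read.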
    


    EDIT:

    I could not find any parameter in qicharts2::qic() that specifies breaks (similar to the breaks argument of scale_x_date() in ggplot2). See the qicharts2 reference manual.

    However, a possible workaround is to convert the date variable into a factor and use that instead. The downside of this approach is that there is no line connecting the dots.

    # Set levels for date variable -- ensure they are unique.
    ReportMonth_levels <- format( unique(df_2$ReportMonth), "%b %y")
    
    df_3 <- 
      df_2 %>% 
      # convert date variable to a factor with set levels
      mutate(ReportMonth = factor( format(ReportMonth, "%b %y"), levels = ReportMonth_levels))
    
    df_3 %>% 
      qicharts2::qic(
        ReportMonth, pct, 
        data = ., 
        facets = ~ grp, # one panel per group in a single figure
        y.percent = TRUE,
        x.angle = 45,
        chart = "i", # choose an appropriate control chart
        xlab = "ReportMonth",
        ylab = "pct"
      )
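
    Another route that may keep the connecting line relies on qic() returning a ggplot object, so a ggplot2 scale can be added to it afterwards. This is a sketch under assumptions rather than a documented recipe: it assumes the plotted x values are available as `p$data$x` and picks the scale from their class, since qic() may store dates as date-times. `one_group`, `x_vals`, `x_scale` and `p_all` are illustrative names, and the data are made up.

```r
library(ggplot2)
library(qicharts2)

# Illustrative data standing in for one group of df_2 (values are made up)
set.seed(42)
one_group <- data.frame(
  ReportMonth = seq(as.Date("2014-01-01"), by = "month", length.out = 24),
  pct = round(rbinom(24, 100, 0.5) / 100, 2)
)

p <- qic(ReportMonth, pct, data = one_group, chart = "i", x.angle = 90)

# The x values as qic() stored them; their class decides which scale applies
x_vals <- sort(unique(p$data$x))

x_scale <- if (inherits(x_vals, "POSIXct")) {
  scale_x_datetime(breaks = x_vals, date_labels = "%Y-%m-%d")
} else {
  scale_x_date(breaks = x_vals, date_labels = "%Y-%m-%d")
}

# One tick label per plotted date
p_all <- p + x_scale
p_all
```

    If x.format is also passed to qic(), ggplot2 will likely report that an existing x scale is being replaced; setting the label format only in the added scale avoids that.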
    
