Search code examples
rlattice

Using panel.linejoin with missing data


This question is very much related to the question and answers received here, where @Mr. Flick helped me with a question I had regarding the xyplot in the lattice package. But seeing that I'm now trouble-shooting some code I thought I'd ask to the "broader public" for some help.

I've been asked by the reviewers of our paper, to present patient body mass index follow-up data similarly to the way we presented their intraoperative data in the link I provided above.

When I plot the data in an analog fashion, the black line representing "mean" stops at three months, but I want it to go through all time points. See image below.

Here's my data called bmi_data

dput(bmi_data)
structure(list(StudyID = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("P1", 
"P2", "P3", "P4", "P5", "P6", "P7"), class = "factor"), BMI = c(37.5, 
43.82794785, 48.87848306, 39.93293705, 42.76788399, 39.44207394, 
50.78043704, 25.61728395, 37.91099773, 39.02185224, 36.00823045, 
37.75602259, 34.06360931, 39.12591051, 25.98765432, 34.89937642, 
32.95178633, 35.62719098, 35.75127802, 32.27078777, NA, 23.61111111, 
32.34835601, NA, 34.33165676, NA, 26.53375883, 35.79604579, 23.20987654, 
31.71060091, NA, 34.29355281, NA, NA, NA), BMITIME2 = structure(c(5L, 
5L, 5L, 5L, 5L, 5L, 5L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L), .Label = c("12 months FU", "3 months FU", "6 months FU", 
"Over 12 months FU", "Preoperative BMI"), class = "factor"), 
    TIME2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 
    4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("Preoperative BMI", 
    "3 months FU", "6 months FU", "12 months FU", "Over 12 months FU"
    ), class = "factor")), .Names = c("StudyID", "BMI", "BMITIME2", 
"TIME2"), class = "data.frame", row.names = c(NA, -35L))

Some data.frame manipulation to get the right order of my time-points.

bmi_data$TIME2 <- factor(bmi_data$BMITIME2, unique(bmi_data$BMITIME2))

And now my code that doesn't seem to be working properly.

require(lattice)
stderr <- function(x) sqrt(var(x,na.rm=TRUE)/length(na.omit(x)))
panel.sem <- function(x, y, col.se=plot.line$col, alpha.se=.10, ...) {
  plot.line <- trellis.par.get("plot.line")
  xs <- if(is.factor(x)) {
    factor(c(levels(x) , rev(levels(x))), levels=levels(x))
  } else {
    xx <- sort(unique(x))
    c(xx, rev(xx))
  }
  means <- tapply(y,x, mean, na.rm=T)
  stderr <- tapply(y,x, stderr)
  panel.polygon(xs, c(means+stderr, rev(means-stderr)), col=col.se, alpha=alpha.se)}

xyplot(BMI~bmi_data$TIME2, groups=StudyID, data=bmi_data, ty=c("l", "p"), 
       panel = function(x, y, ...) {
           panel.sem(x,y, col.se="grey")
           panel.xyplot(x, y, ...)
           panel.linejoin(x, y, horizontal = FALSE ,..., col="black", lty=1, lwd=4)}
       ,xlab="Measurement Time Point", 
       ylab=expression("BMI"~"(kg/m^2)"))

Which results in this plot: The problematic plot in lattice

Any help for solving this question is greatly appreciated!!!


Solution

  • The problem is that you have missing data (NA) values in this data set. The panel.linejoin() calls mean() over the observations at each x and if there are NA vales, by default the mean will be NA and then a line won't be drawn. To change that, you can specify a function wrapper to panel.linejoin. Try

    xyplot(BMI~bmi_data$TIME2, groups=StudyID, data=bmi_data, ty=c("l", "p"), 
        panel = function(x, y, ...) {
            panel.sem(x,y, col.se="grey")
            panel.xyplot(x, y, ...)
            panel.linejoin(x, y, horizontal = FALSE ,..., col="black", 
                lty=1, lwd=4, na.rm=T, 
                fun=function(x) mean(x, na.rm=T))
        }, 
        xlab="Measurement Time Point", 
        ylab=expression("BMI"~"(kg/m^2)")
    )