Search code examples
rlattice

How can I add a line to my xyplot based upon the mean of an attribute of my data?


I have created the base graph I am looking to get I just can't figure out how to add a line to the graph based on the mean of the murder attribute within the USArrests dataset. After that, I also need to color the state names based upon if they fall above or below the line.

The graph I have: https://ibb.co/V3VkYt4

The graph I need: https://ibb.co/4TTnQM1

I have tried adding an abline with the Murder attributes mean as the input and the line appears outside of my graph not sure what I am doing wrong.

library(lattice)
textPlot <- function()
{
  data <- cbind(rownames(USArrests), USArrests)
  names(data) <- c("State", names(data)[2:5])

averageM <- mean(USArrests$Murder)

         xyplot(Murder~UrbanPop, data, 
         groups=State, panel=drawText, 
         main="Murder vs. Urban Population")

}

drawText <- function(x,y,groups,...)
  {
    panel.text(x=x,y=y,label=groups,cex=y/10)
}   

Solution

  • Your graph appears to show a sloped regression line rather than a horizontal line for the mean. Lattice can add a regression line in xyplot directly from the variables with panel.lmline or from a regression model (or a constant) with panel.abline. A little more work is required to classify the states that are above or below selected murder rate. Here's one way to do it with lattice showing both types of regression lines.

    # Load the lattice package, create data.frame with state names from USAarrests
      library(lattice)
      df <- data.frame(State = rownames(USArrests), USArrests)
    
    # Determine regression and mean murder rate outside of xyplot()
    # However, these operations don't have to be done outside of the lattice function
      fm <- lm(Murder ~ UrbanPop, df)
      averageM <- mean(USArrests$Murder)
    
    # Add a variable to the data.frame indicating the classification
      df$type <- factor(ifelse(df$Murder < fm$fitted, "low", "high"))
    
    # Plot via lattice with explicit panel() function
      xyplot(Murder ~ UrbanPop, data = df,
        panel = function(x, y, ...) {
          panel.abline(fm, col = "red", lwd = 2)
    #     panel.lmline(x, y, col = "red", lwd = 2) # This would do the same
          panel.abline(h = averageM, col = "red", lty = 2, lwd = 2)
    #     panel.abline(h = mean(y), col = "red", lty = 2, lwd = 2) # This would do the same
          panel.text(x, y, labels = df$State, cex = y/10, col = c(2,4)[df$type])
        }
      )