I have created the base graph I am looking to get I just can't figure out how to add a line to the graph based on the mean of the murder attribute within the USArrests dataset. After that, I also need to color the state names based upon if they fall above or below the line.
The graph I have: https://ibb.co/V3VkYt4
The graph I need: https://ibb.co/4TTnQM1
I have tried adding an abline with the Murder attributes mean as the input and the line appears outside of my graph not sure what I am doing wrong.
library(lattice)
textPlot <- function()
{
data <- cbind(rownames(USArrests), USArrests)
names(data) <- c("State", names(data)[2:5])
averageM <- mean(USArrests$Murder)
xyplot(Murder~UrbanPop, data,
groups=State, panel=drawText,
main="Murder vs. Urban Population")
}
drawText <- function(x,y,groups,...)
{
panel.text(x=x,y=y,label=groups,cex=y/10)
}
Your graph appears to show a sloped regression line rather than a horizontal line for the mean. Lattice can add a regression line in xyplot
directly from the variables with panel.lmline
or from a regression model (or a constant) with panel.abline
. A little more work is required to classify the states that are above or below selected murder rate. Here's one way to do it with lattice showing both types of regression lines.
# Load the lattice package, create data.frame with state names from USAarrests
library(lattice)
df <- data.frame(State = rownames(USArrests), USArrests)
# Determine regression and mean murder rate outside of xyplot()
# However, these operations don't have to be done outside of the lattice function
fm <- lm(Murder ~ UrbanPop, df)
averageM <- mean(USArrests$Murder)
# Add a variable to the data.frame indicating the classification
df$type <- factor(ifelse(df$Murder < fm$fitted, "low", "high"))
# Plot via lattice with explicit panel() function
xyplot(Murder ~ UrbanPop, data = df,
panel = function(x, y, ...) {
panel.abline(fm, col = "red", lwd = 2)
# panel.lmline(x, y, col = "red", lwd = 2) # This would do the same
panel.abline(h = averageM, col = "red", lty = 2, lwd = 2)
# panel.abline(h = mean(y), col = "red", lty = 2, lwd = 2) # This would do the same
panel.text(x, y, labels = df$State, cex = y/10, col = c(2,4)[df$type])
}
)