I am trying to create a forest plot in R using the forestplot package. I want to assign different colors to the points and lines based on whether the log.est values are above or below 0. Specifically, I want:
blue for log.est > 0
red for log.est <= 0
Here is my data:
df <- structure(list(variable = c("N.Acetylputrescine", "Homocitrulline", "Argininic.acid",
"SM.C16.1", "Oxalic.acid", "Cer.d18.1.22.0.", "Citrulline",
"X3.Hydroxybutyric.acid", "Glycine", "Cer.d18.1.25.0.",
"Uridine", "Cer.d18.1.24.1.", "Deoxyguanosine", "Adenine"),
log.est = c(18.12, 11.70, 11.61, 9.95, 8.79, 8.72, 7.07, 4.13, 2.63,
-5.85, -6.47, -6.81, -10.47, -14.84),
p.value = c(9.49e-05, 0.000196, 0.0117, 0.137, 7.44e-05, 0.251, 0.514,
0.000162, 0.0376, 0.909, 0.000858, 0.345, 0.000531, 1.4e-05),
log.lower = c(17.12, 10.62, 9.44, -8.30, 7.81, -8.23, -8.08, 3.08,
-1.49, -10.03, -7.14, -8.43, -11.12, -15.37),
log.upper = c(18.71, 12.31, 12.44, 11.16, 9.38, 10.16, 9.08, 4.74,
3.59, 9.86, -5.19, 6.91, -9.27, -13.97)
), class = "data.frame", row.names = c(NA, -14L))
Here is the code I used to generate the forest plot:
library(forestplot)
forestplot(
  labeltext = cbind("Variable" = df$variable,
                    "Coefficient" = round(df$log.est, 2),
                    "P-value" = format.pval(df$p.value)),
  mean = df$log.est,
  lower = df$log.lower,
  upper = df$log.upper,
  xlab = "Effect Size",
  title = "Logistic Regression"
)
This generates the forest plot, but all points and lines are the same color. How can I assign blue to points with log.est > 0 and red to points with log.est <= 0?
I tried the code below, but it colored every point and line blue:
# Define colors based on log.est values
df$color <- ifelse(df$log.est > 0, "blue", "red")
# Create the forest plot
forestplot(
  labeltext = cbind("Variable" = df$variable,
                    "Coefficient" = round(df$log.est, 2),
                    "P-value" = format.pval(df$p.value)),
  mean = df$log.est,
  lower = df$log.lower,
  upper = df$log.upper,
  xlab = "Effect Size",
  title = "Logistic Regression",
  boxsize = 0.3, # Size of the points
  txt_gp = fpTxtGp(label = gpar(cex = 0.8)), # Text size customization
  col = fpColors(box = df$color, line = df$color, zero = "black")
)
Any help or suggestions would be greatly appreciated. Thank you!
You can try ggplot2 instead of forestplot:
library(ggplot2)
# Order df by log.est
df <- df[order(df$log.est, decreasing = TRUE), ]
# Add a color column: blue for log.est > 0, red for log.est <= 0
df$color <- ifelse(df$log.est > 0, "blue", "red")
# Forest plot
ggplot(df, aes(x = log.est, y = reorder(variable, log.est),
               xmin = log.lower, xmax = log.upper, color = color)) +
  geom_pointrange() +
  geom_vline(xintercept = 0, linetype = "dashed", color = "black") +
  scale_color_identity() +
  labs(title = "Forest Plot with Color-Coded Estimates",
       x = "Log Estimate",
       y = "Variable") +
  theme_minimal() +
  theme(axis.text.y = element_text(size = 10), axis.title.y = element_blank())
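If you would rather stay with forestplot, here is a minimal sketch of another route. As far as I can tell, fpColors treats a vector of colors as one color per group (one per column of mean) rather than one per row, which would explain why only the first color shows up. The package vignette gets per-row colors by passing a custom drawing function through fn.ci_norm. The helper name draw_colored_ci below is my own, not part of the package, and the sketch assumes the drawing function is called once per row, in row order, as in the vignette example. I have not checked this against every forestplot version.

```r
library(forestplot)

# Same estimates as in the question (only the columns needed here)
df <- data.frame(
  variable = c("N.Acetylputrescine", "Homocitrulline", "Argininic.acid",
               "SM.C16.1", "Oxalic.acid", "Cer.d18.1.22.0.", "Citrulline",
               "X3.Hydroxybutyric.acid", "Glycine", "Cer.d18.1.25.0.",
               "Uridine", "Cer.d18.1.24.1.", "Deoxyguanosine", "Adenine"),
  log.est = c(18.12, 11.70, 11.61, 9.95, 8.79, 8.72, 7.07, 4.13, 2.63,
              -5.85, -6.47, -6.81, -10.47, -14.84),
  log.lower = c(17.12, 10.62, 9.44, -8.30, 7.81, -8.23, -8.08, 3.08,
                -1.49, -10.03, -7.14, -8.43, -11.12, -15.37),
  log.upper = c(18.71, 12.31, 12.44, 11.16, 9.38, 10.16, 9.08, 4.74,
                3.59, 9.86, -5.19, 6.91, -9.27, -13.97)
)

# One color per row: blue above zero, red at or below zero
df$color <- ifelse(df$log.est > 0, "blue", "red")

# Hypothetical helper (not part of forestplot): a closure that counts its
# calls and passes the i-th row color to the default drawing function.
# Assumption: forestplot calls it once per row, top to bottom.
draw_colored_ci <- local({
  i <- 0
  function(..., clr.line, clr.marker) {
    i <<- i + 1
    fpDrawNormalCI(..., clr.line = df$color[i], clr.marker = df$color[i])
  }
})

forestplot(
  labeltext = cbind(df$variable, round(df$log.est, 2)),
  mean = df$log.est,
  lower = df$log.lower,
  upper = df$log.upper,
  fn.ci_norm = draw_colored_ci,
  xlab = "Effect Size",
  title = "Logistic Regression"
)
```

The counter lives inside the closure, so re-create draw_colored_ci before drawing the plot again. Otherwise the index keeps counting past the last row and the colors come out as NA.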