My data frame looks like this:
I'm trying to plot the mean_LFC of every barcode (red lines, x-axis) for each corresponding TF_variant (y-axis), and I want to show in the background a density plot of the mean_LFC values of my entire dataset behind each TF_variant. So each row along the y-axis should show the same background distribution as the last row in the picture, but I can't figure out how to draw it for every single row/TF_variant.
This is my code:
library(ggplot2)
library(ggridges)  # provides theme_ridges()

ggplot(significant_genes_df, aes(x = mean_LFC, y = TF_variant)) +
  # y = 1 so the density is computed on the whole dataset, not per TF_variant
  stat_density(aes(y = 1, fill = after_stat(density)),
               geom = "raster", position = "identity") +
  scale_fill_gradient(low = "gray95", high = "gray35", name = "Density") +
  scale_y_discrete(expand = c(0, 0.5)) +
  scale_x_continuous(expand = c(0, 0)) +
  theme_ridges(font_size = 14, font_family = "", grid = FALSE, center = TRUE) +
  # one red tick per barcode
  geom_point(shape = "|", color = "red", alpha = 0.45, size = 3,
             position = position_jitter(height = 0)) +
  geom_vline(xintercept = 0, linetype = "dotted", color = "steelblue3")
In stat_density I set y = 1 because I want the distribution of the entire dataset rather than the values of each TF_variant, but I want that same distribution shown behind every TF_variant on the y-axis.
I would probably do this using geom_tile instead of geom_raster. Unlike geom_raster, geom_tile allows you to specify height. It looks like you have 41 rows, so you could make the y value 21 (the middle row) and the height 41, so that each tile spans every row:
library(ggplot2)

ggplot(significant_genes_df, aes(x = mean_LFC, y = TF_variant)) +
  stat_density(aes(y = 21, fill = after_stat(density),
                   color = after_scale(fill),
                   width = mean(after_stat(diff(x))) * 2),
               geom = "tile", linewidth = 0,
               height = 41, position = "identity") +
  scale_fill_gradient(low = "white", high = "gray50", name = "Density") +
  scale_y_discrete(expand = c(0, 0.5)) +
  scale_x_continuous(expand = c(0, 0)) +
  ggridges::theme_ridges(font_size = 14, font_family = "",
                         grid = FALSE, center = TRUE) +
  geom_point(shape = "|", color = "red", alpha = 0.45, size = 3,
             position = position_jitter(height = 0)) +
  geom_vline(xintercept = 0, linetype = "dotted", color = "steelblue3")
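The 21 and 41 above are hard-coded for a plot with 41 rows. If the number of TF_variant values might change, they could be derived from the data instead. This is only a sketch: tile_span is a hypothetical helper name (not part of ggplot2), and the TF1 to TF41 labels are stand-ins for the real variants.

```r
# Sketch: derive the tile's centre and height from the data instead of
# hard-coding 21 and 41. `tile_span` is a hypothetical helper name.
tile_span <- function(variants) {
  n_rows <- length(unique(variants))  # one row per distinct TF_variant
  list(center = (n_rows + 1) / 2,     # midpoint of discrete positions 1..n_rows
       height = n_rows)               # spans 0.5 .. n_rows + 0.5
}

span <- tile_span(paste0("TF", 1:41))  # stand-in labels for 41 variants
span$center  # 21
span$height  # 41
```

With the real data this would presumably be called as tile_span(significant_genes_df$TF_variant), with span$center and span$height used in place of the literals in stat_density().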
Data used
The question doesn't include any data we can use, so I created this set, which has the same names and structure as the data in the question:
set.seed(22)

significant_genes_df <- data.frame(
  TF_variant = rep(apply(replicate(41, sample(LETTERS, 10)), 2,
                         paste, collapse = ""),
                   times = rpois(41, 5)))

significant_genes_df$mean_LFC <- rnorm(nrow(significant_genes_df), sd = 3)
significant_genes_df <- subset(significant_genes_df, abs(mean_LFC) < 8)