I counted the exclamation marks in the title and abstract of papers, grouped by publication year. Now I want to show the distribution of this count for each individual year using kernel density estimation, plotted in the style of a figure I found in another publication (Plavén-Sigray et al. eLife 2017, https://elifesciences.org/articles/2772):
Do you have any idea how I could achieve this in R? I would be glad if you could suggest a package. I have added some toy data along with what I have tried so far.
library(ggplot2)

set.seed(176)
df <- data.frame(
  id = 1:2000,
  amount = sample(0:3, 2000, replace = TRUE),
  year = sample(1990:2010, 2000, replace = TRUE)
)

ggplot(df, aes(x = year, y = amount)) +
  geom_density_2d() +
  geom_density_2d_filled() +
  geom_density_2d(colour = "black")
I get the following result, which is not really what I want:
Any help would be appreciated. Thank you in advance!
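You can get a plot like this with ggplot2 and a little dplyr, without any specialised plotting packages: estimate a density for each year over a shared range of values, then draw the result with geom_raster(). Here's a full reprex: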
library(tidyverse)

set.seed(1)
df <- data.frame(year = rep(1920:2000, each = 100),
                 amount = rnorm(8100, rep(120:200, each = 100), 20))

# Evaluate every year's density over the same range so the grid is regular
rng <- range(df$amount)

df %>%
  group_by(year) %>%
  # reframe() (dplyr >= 1.1.0) returns many rows per group; in older
  # versions summarize() did the same job
  reframe({
    d <- density(amount, from = rng[1], to = rng[2])
    tibble(Amount = d$x, Density = d$y)
  }) %>%
  ggplot(aes(year, Amount, fill = Density)) +
  geom_raster(interpolate = TRUE) +
  scale_fill_viridis_c(option = "magma") +
  theme_minimal(base_size = 20) +
  coord_cartesian(expand = FALSE) +
  theme(legend.position = "top",
        legend.key.width = unit(3, "cm"),
        legend.title = element_text(vjust = 1))
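For the toy data in the question, the same idea might look like the sketch below. It assumes dplyr >= 1.1.0 (for reframe()). The names toy, lo, hi and dens, the grid size n = 256, and the padding of 1 around the observed counts are illustrative choices, not anything taken from the original figure.

```r
library(dplyr)
library(ggplot2)

# Toy data from the question: counts 0-3 per paper, years 1990-2010
set.seed(176)
toy <- data.frame(
  amount = sample(0:3, 2000, replace = TRUE),
  year   = sample(1990:2010, 2000, replace = TRUE)
)

# Shared evaluation range so every year lands on the same grid.
# Padding by 1 on each side is an arbitrary choice for this sketch.
lo <- min(toy$amount) - 1
hi <- max(toy$amount) + 1

# One kernel density estimate per year, each with 256 grid points
dens <- toy %>%
  group_by(year) %>%
  reframe({
    d <- density(amount, from = lo, to = hi, n = 256)
    data.frame(Amount = d$x, Density = d$y)
  })

p <- ggplot(dens, aes(year, Amount, fill = Density)) +
  geom_raster(interpolate = TRUE) +
  scale_fill_viridis_c(option = "magma") +
  coord_cartesian(expand = FALSE) +
  theme_minimal()

print(p)
```

Because the counts are discrete, each year's density shows bumps around the integer values. The bw and adjust arguments of density() control how strongly those bumps are smoothed.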