I am pretty new to ggplot2 and would like to draw a histogram of the number of articles published per year (or per 5 years) for a systematic review. I have a data frame like this:
Df <- data.frame(
  name = c("article1", "article2", "article3", "article4"),
  date = c(2004, 2009, 1999, 2007),
  question1 = c(1, 0, 1, 0),
  question2 = c(1, 1, 1, 1),
  question3 = c(1, 1, 1, 1),
  question4 = c(0, 0, 0, 0),
  question5 = c(1, 0, 1, 0),
  stringsAsFactors = FALSE
)
library(ggplot2)
ggplot(Df, aes(date)) +
  geom_histogram(binwidth = 5, color = "black")
Plus, for each bar of the histogram, I would like to fill the bar according to the number of articles that covered a particular type of question (questions 1 to 5, coded 1 or 0 depending on whether the question is present or absent). The thing is, I have 5 questions that I would like to show in one diagram, and I don't know how to do that. I tried the fill argument and also tried geom_bar, but failed.
Thanks so much in advance for your help.
Here is a way. It's a simple bar plot with ggplot.
Problems of this type generally have to do with reshaping the data. The data is in wide format (one column per question), while ggplot needs it in long format (one row per article and question). See this post on how to reshape the data from wide to long format.
library(dplyr)
library(tidyr)
library(ggplot2)
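# `t` is the test data frame built at the end of this answer
t %>%
  select(-Code) %>%
  # wide to long: one row per article/question pair; the 0/1 codes go to `value`
  pivot_longer(
    cols = starts_with("Question"),
    names_to = "Question"
  ) %>%
  # keep only the questions an article actually covers (coded 1);
  # without this, every question gets the same count in every bar
  filter(value == 1) %>%
  mutate(Publication_date = factor(Publication_date)) %>%
  ggplot(aes(Publication_date, fill = Question)) +
  geom_bar() +
  xlab("Publication Date")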
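# Test data (run this first; it creates the data frame `t` used above)
set.seed(2021)
n <- 200
Code <- paste0("Article", 1:n)
Publication_date <- sample(2000:2020, n, TRUE)
# n x 5 matrix of random 0/1 indicators, one column per question
Question <- replicate(5, rbinom(n, 1, 0.5))
colnames(Question) <- paste0("Question", 1:5)
t <- data.frame(Code, Publication_date)
t <- cbind(t, Question)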
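The same idea can be applied directly to the Df from the question, grouped into 5-year periods. This is only a sketch: the names `Df_long`, `covered` and `period` are made up for illustration, and starting each period at a multiple of 5 (1995, 2000, 2005, ...) is an assumption, not something the question specifies. Only rows coded 1 are kept, so each bar segment counts the articles that cover that question.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Data frame from the question
Df <- data.frame(
  name = c("article1", "article2", "article3", "article4"),
  date = c(2004, 2009, 1999, 2007),
  question1 = c(1, 0, 1, 0),
  question2 = c(1, 1, 1, 1),
  question3 = c(1, 1, 1, 1),
  question4 = c(0, 0, 0, 0),
  question5 = c(1, 0, 1, 0),
  stringsAsFactors = FALSE
)

Df_long <- Df %>%
  # wide to long: one row per article/question pair
  pivot_longer(
    cols = starts_with("question"),
    names_to = "question",
    values_to = "covered"   # hypothetical column name for the 0/1 code
  ) %>%
  # keep only the questions an article actually covers
  filter(covered == 1) %>%
  # assumption: 5-year periods start at multiples of 5
  mutate(period = factor(5 * floor(date / 5)))

p <- ggplot(Df_long, aes(period, fill = question)) +
  geom_bar(color = "black") +
  xlab("Publication period (first year of each 5-year bin)") +
  ylab("Articles covering each question (stacked)")

print(p)
```

Because the bars are stacked, the total height of a bar is the number of article/question pairs in that period, not the number of articles. If side-by-side bars read better, `geom_bar(position = "dodge")` is one option. A question that no article covers (question4 here) has no rows left after the filter, so it does not appear in the legend.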