I am a beginner in R and I ran into two problems here. I would be grateful if anyone could help me.
Here is my code:
library(ggplot2)
library(dplyr)  # provides %>% and select()
library(tidyr)  # provides gather()
ds <- as.data.frame(Titanic)
color_survived = "#FFA500"
color_dead = "#0000FF"
ds$Sex <- as.factor(ds$Sex)
ds$Survived <- as.factor(ds$Survived)
categorical.ds <- ds %>%
  select(Sex,
         Class,
         Survived) %>%
  gather(key = "key", value = "value", -Survived)
categorical.ds %>%
  ggplot(aes(value)) +
  geom_bar(aes(x = value,
               fill = Survived),
           alpha = .2,
           position = "dodge",
           color = "black",
           width = .7) +
  labs(x = "",
       y = "") +
  theme(
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()) +
  facet_wrap(~ key, scales = "free", nrow = 1) +
  scale_fill_manual(values = c(color_survived, color_dead), name = "Survived", labels = c("Survived", "Dead"))
Here is the plot:
Thank you very much!
Here is something, I hope it helps!
Long story short:
- I did not use gather (if you really want to reshape, pivot_longer is more recommended). I used group_by and summarize instead, which is what you need for aggregations.
- The two plots are combined with patchwork, so it needs to be loaded.
- geom_col is what you want; geom_bar plots counts.
- geom_text adds the labels.
Welcome to the R community!
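As a side note on the pivot_longer remark above, here is a minimal sketch of how the gather call from the question could be written with pivot_longer. It assumes dplyr (1.0 or later, for across) and tidyr are installed. Converting Sex and Class to character first is my own choice, so that two factors with different levels can share one value column without surprises.

```r
library(dplyr)
library(tidyr)

ds <- as.data.frame(Titanic)

# Long format: one row per (Survived, key, value) combination,
# equivalent in shape to gather(key = "key", value = "value", -Survived)
categorical.ds <- ds %>%
  select(Sex, Class, Survived) %>%
  # assumption: store the categories as plain text before stacking them
  mutate(across(c(Sex, Class), as.character)) %>%
  pivot_longer(cols = -Survived, names_to = "key", values_to = "value")

head(categorical.ds)
```

Note that select drops the Freq column here, so the long table only lists which combinations exist, not how many passengers fall into each one.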
##we need the tidyverse package for data manipulation and ggplot
library(tidyverse)
##since we'll make 2 different graphs, patchwork will allow us to combine them
library(patchwork)
#data reproduction
ds <- data.frame(Sex = sample(c("male", "female"), size = 100, replace = TRUE),
                 Survived = sample(c("survived", "dead"), size = 100, replace = TRUE),
                 Class = sample(c("1st", "2nd", "3rd", "crew"), size = 100, replace = TRUE))
#I start with the classes: group_by splits the data into one group per class
#summarize then computes the rate within each group: number of survivors / number of non-NA rows in the group
class.ds <- ds %>%
  select(Class, Survived) %>%
  group_by(Class) %>%
  summarize(surv_rate = sum(Survived == "survived") / sum(!is.na(Survived)))
#we do the same for the sex
sex.ds <- ds %>%
  select(Sex, Survived) %>%
  group_by(Sex) %>%
  summarize(surv_rate = sum(Survived == "survived") / sum(!is.na(Survived)))
#now the plots: class is our x axis, the rate is the y axis
#geom_col is used for the bars, geom_text is for the labels
#of course, you can then add color etc., here I keep it simple
class.plot <- class.ds %>% ggplot(aes(Class, surv_rate)) +
  geom_col() +
  geom_text(aes(label = round(surv_rate, 2)), nudge_y = 0.02)
#same thing for the sex
sex.plot <- sex.ds %>% ggplot(aes(Sex, surv_rate)) +
  geom_col() +
  geom_text(aes(label = round(surv_rate, 2)), nudge_y = 0.02)
#now we just need to group the graphs with patchwork
class.plot + sex.plot
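The example above uses simulated data with one row per passenger. The table from the question, as.data.frame(Titanic), is already aggregated: each row is a combination of Class, Sex, Age and Survived, with the passenger count in the Freq column and Survived coded as "Yes"/"No". A possible adaptation, as a sketch only, is to weight by Freq when computing the rate. The object names titanic, class.rate and sex.rate are just illustrative.

```r
library(dplyr)
library(ggplot2)
library(patchwork)

titanic <- as.data.frame(Titanic)

# survival rate per class: passengers who survived / all passengers in the class
class.rate <- titanic %>%
  group_by(Class) %>%
  summarize(surv_rate = sum(Freq[Survived == "Yes"]) / sum(Freq))

# same computation per sex
sex.rate <- titanic %>%
  group_by(Sex) %>%
  summarize(surv_rate = sum(Freq[Survived == "Yes"]) / sum(Freq))

# geom_col draws the precomputed rates, geom_text prints them above the bars
class.plot <- ggplot(class.rate, aes(Class, surv_rate)) +
  geom_col() +
  geom_text(aes(label = round(surv_rate, 2)), nudge_y = 0.02)

sex.plot <- ggplot(sex.rate, aes(Sex, surv_rate)) +
  geom_col() +
  geom_text(aes(label = round(surv_rate, 2)), nudge_y = 0.02)

# patchwork places the two plots side by side
class.plot + sex.plot
```

This also shows why geom_bar on the original table gives bars of equal height: it counts rows, and every combination appears exactly once, while the real passenger numbers sit in Freq.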