I am quite new to R and I am struggling with the following issue.
I want to create a barplot with 2 bars that visualizes the time to metastasis in a cohort of 100 patients, out of which 40 developed metastasis. One bar represents the patients who received immunotherapy and the second bar represents the patients who did not receive immunotherapy.
I created an Excel file with the following columns:
Basically, I have to:
Filter only the patients who developed metastasis (1)
Create a barplot:
y-axis: TTM (time to metastasis)
x-axis: Immunotherapy (1 bar - yes, 1 bar - no)
I'd be thankful for any help,
kind regards
ggplot(data = M1, aes(x = Immunotherapy, y = Time)) + geom_col()
Update 2: see the OP's second question in the comments:
We could use facet_wrap(). I added a Cancer_Type column to M1.
The new data:
M1 <- structure(list(Metastasis = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1), time = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 3, 5, 1, 3, 5, 8, 6, 7, 10, 7, 9, 9, 9, 3,
2, 5, 9, 4, 9, 6, 8, 1, 6, 9, 6, 10, 6, 6, 4, 7, 6, 10, 5, 5,
2, 9, 6, 1, 1, 2), Immunotherapy = c(0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1), Cancer_Type = c("Colon cancer", "Lung cancer",
"Colon cancer", "Lung cancer", "Colon cancer", "Colon cancer",
"Colon cancer", "Lung cancer", "Colon cancer", "Colon cancer",
"Colon cancer", "Lung cancer", "Colon cancer", "Lung cancer",
"Lung cancer", "Colon cancer", "Lung cancer", "Colon cancer",
"Colon cancer", "Lung cancer", "Colon cancer", "Lung cancer",
"Lung cancer", "Colon cancer", "Colon cancer", "Lung cancer",
"Lung cancer", "Colon cancer", "Lung cancer", "Lung cancer",
"Lung cancer", "Lung cancer", "Lung cancer", "Colon cancer",
"Colon cancer", "Colon cancer", "Colon cancer", "Lung cancer",
"Colon cancer", "Lung cancer", "Lung cancer", "Colon cancer",
"Colon cancer", "Colon cancer", "Lung cancer", "Lung cancer",
"Colon cancer", "Colon cancer", "Colon cancer", "Lung cancer",
"Lung cancer", "Colon cancer", "Lung cancer", "Colon cancer",
"Colon cancer", "Lung cancer", "Lung cancer", "Lung cancer",
"Colon cancer", "Lung cancer", "Colon cancer", "Lung cancer",
"Lung cancer", "Lung cancer", "Colon cancer", "Colon cancer",
"Lung cancer", "Lung cancer", "Colon cancer", "Lung cancer",
"Colon cancer", "Lung cancer", "Colon cancer", "Colon cancer",
"Colon cancer", "Colon cancer", "Lung cancer", "Colon cancer",
"Lung cancer", "Colon cancer", "Colon cancer", "Colon cancer",
"Lung cancer", "Colon cancer", "Colon cancer", "Colon cancer",
"Colon cancer", "Lung cancer", "Lung cancer", "Lung cancer",
"Colon cancer", "Colon cancer", "Colon cancer", "Lung cancer",
"Lung cancer", "Lung cancer", "Lung cancer", "Colon cancer",
"Colon cancer", "Lung cancer")), row.names = c(NA, -100L), class = "data.frame")
The code with facet_wrap()
library(dplyr)
library(ggplot2)

M1 %>%
  # recode 0/1 as a labelled factor so the axis and legend read "no"/"yes"
  mutate(Immunotherapy = factor(as.character(Immunotherapy), labels = c("no", "yes"))) %>%
  # keep only the patients who developed metastasis
  filter(Metastasis >= 1) %>%
  ggplot(aes(x = Immunotherapy, y = time, fill = Immunotherapy)) +
  geom_boxplot(outlier.shape = NA, alpha = 0.8, color = "black") +
  geom_jitter(width = 0.2, alpha = 0.5, size = 3) +
  scale_fill_manual(values = c("#E69F00", "#56B4E9")) +
  labs(x = "Immunotherapy", y = "Time to metastasis (months)") +
  # one panel per cancer type
  facet_wrap(~Cancer_Type) +
  theme_classic() +
  scale_y_continuous(breaks = seq(0, ceiling(max(M1$time)), by = 2))
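Because facet_wrap() splits the filtered data into one panel per cancer type, it can help to check how many patients end up in each panel and group before reading much into the boxes. A minimal sketch, using a small hypothetical data frame (toy, not the real M1) with the same column names:

```r
library(dplyr)

# Hypothetical toy rows with the same column names as M1
toy <- data.frame(
  Metastasis    = c(1, 1, 1, 1, 0, 1),
  Immunotherapy = c(0, 1, 1, 0, 0, 1),
  Cancer_Type   = c("Colon cancer", "Colon cancer", "Lung cancer",
                    "Lung cancer", "Lung cancer", "Lung cancer")
)

# Same filter as in the plot, then one row per panel/group combination
panel_counts <- toy %>%
  filter(Metastasis >= 1) %>%
  count(Cancer_Type, Immunotherapy)

print(panel_counts)
```

Running the same filter() and count() on M1 should show how many points each box in the faceted plot is based on.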
Update 1: see the OP's questions in the comments:
Question 1:
Question 2: I think you mean Immunotherapy and not Metastasis:
With this line: mutate(Immunotherapy = factor(as.character(Immunotherapy), labels = c("no", "yes")))
we convert Immunotherapy (0, 1) from numeric to a factor. A factor has levels (the distinct values) and labels (the names displayed for those values), so we can give Immunotherapy the labels "no" and "yes" as in the line above. The key point is the ordering: factor() sorts the levels, so 0 comes before 1, and the labels are assigned in that same order. The first label, "no", therefore goes to 0 and the second label, "yes", goes to 1.
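To make the level-to-label mapping concrete, here is a small base R sketch. The vector immuno is a made-up stand-in for the Immunotherapy column. The second call, which states the levels explicitly, is an alternative I am suggesting rather than what the answer's code uses:

```r
# Toy vector standing in for the Immunotherapy column (0 = no, 1 = yes)
immuno <- c(0, 1, 1, 0, 1)

# as.character() gives "0"/"1"; factor() sorts the levels ("0" before "1")
# and assigns the labels to the levels in that order
f_implicit <- factor(as.character(immuno), labels = c("no", "yes"))

# Stating the levels explicitly spells out the 0 -> "no", 1 -> "yes" mapping
# and keeps it stable even if one of the two values is absent from the data
f_explicit <- factor(immuno, levels = c(0, 1), labels = c("no", "yes"))

print(f_implicit)
print(table(original = immuno, recoded = f_explicit))
```

The explicit form is a bit more defensive: if a filtered subset happened to contain only 1s, the implicit version would fail because two labels would be supplied for a single level.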
First answer: we could do something like this:
library(dplyr)
library(ggplot2)

# create fake data
set.seed(123)  # optional: makes the random sample reproducible
M1 <- data.frame(
  Metastasis = c(rep(0, 60), rep(1, 40)),
  time = c(rep(0, 60), sample(1:10, 40, replace = TRUE)),
  Immunotherapy = c(rep(0, 70), rep(1, 30))
)

M1 %>%
  # recode 0/1 as a labelled factor so the axis and legend read "no"/"yes"
  mutate(Immunotherapy = factor(as.character(Immunotherapy), labels = c("no", "yes"))) %>%
  # keep only the patients who developed metastasis
  filter(Metastasis >= 1) %>%
  ggplot(aes(x = Immunotherapy, y = time, fill = Immunotherapy)) +
  geom_boxplot(outlier.shape = NA, alpha = 0.8, color = "black") +
  geom_jitter(width = 0.2, alpha = 0.5, size = 3) +
  scale_fill_manual(values = c("#E69F00", "#56B4E9")) +
  labs(x = "Immunotherapy", y = "Time to metastasis (months)") +
  theme_classic() +
  scale_y_continuous(breaks = seq(0, ceiling(max(M1$time)), by = 2))
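If a bar plot with exactly two bars is required, as the question asks, one option is to summarise first and plot one value per group. Passing the raw rows to geom_col() stacks every patient's time, so the bar height becomes the sum rather than a typical value. The sketch below uses the mean as the summary, which is my assumption (the median may suit skewed times better), and toy is a hypothetical data frame built like the fake M1 above:

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data with the same column names as M1
set.seed(1)
toy <- data.frame(
  Metastasis    = c(rep(0, 60), rep(1, 40)),
  time          = c(rep(0, 60), sample(1:10, 40, replace = TRUE)),
  Immunotherapy = c(rep(0, 70), rep(1, 30))
)

# Keep only patients with metastasis, then compute one mean per group
ttm_summary <- toy %>%
  filter(Metastasis == 1) %>%
  mutate(Immunotherapy = factor(Immunotherapy, levels = c(0, 1),
                                labels = c("no", "yes"))) %>%
  group_by(Immunotherapy) %>%
  summarise(mean_time = mean(time), n = n(), .groups = "drop")

# One bar per group; bar height is the group mean
p <- ggplot(ttm_summary, aes(x = Immunotherapy, y = mean_time, fill = Immunotherapy)) +
  geom_col(color = "black") +
  scale_fill_manual(values = c("#E69F00", "#56B4E9")) +
  labs(x = "Immunotherapy", y = "Mean time to metastasis (months)") +
  theme_classic()

p
```

A bar of means hides the spread within each group, which is why the boxplot with jittered points above is arguably the more informative display.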