I have the following dataframe df, where the variable types holds up to 3 types for each ID (the dataset has approximately 3000 rows):
ID types grade num
a01 a,b,c 7.1 1
a02 c,d 7.7 3
a03 c 7.3 4
a04 a,c,f 7.9 5
a05 a,c,e 6.7 3
I want to create a scatterplot where the x axis corresponds to the num column, the y axis corresponds to the grade column, and the color of each point corresponds to its type, similar to this: https://i.sstatic.net/vWmVK.png
However, since types can hold more than one value per row, I'm struggling to plot it. If types only had one value, I know I could simply do geom_point(aes(colour = types)), but since it can have up to 3, I don't know how to proceed.
I like tidyr::separate_rows, which by default splits the column in question into multiple rows, one for each separate value it detects.
library(tidyverse)
df %>%
  separate_rows(types) %>%
  ggplot(aes(num, grade, color = types)) +
  # geom_point() + # points are overplotted
  geom_jitter(width = 0.1, height = 0.1)
Or more minimally:
ggplot(tidyr::separate_rows(df, types), aes(num, grade, color = types)) +
  # geom_point() + # points are overplotted
  geom_jitter(width = 0.1, height = 0.1)
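To make the reshaping step concrete, here is a small sketch that rebuilds the five sample rows from the question and checks what separate_rows produces before anything is plotted. The name long is just an illustrative choice, and sep = "," is spelled out as an assumption about the real data (comma-delimited, no stray spaces) rather than relying on the default.

```r
library(tidyr)

# Sample data copied from the question (the real dataset has ~3000 rows)
df <- data.frame(
  ID    = c("a01", "a02", "a03", "a04", "a05"),
  types = c("a,b,c", "c,d", "c", "a,c,f", "a,c,e"),
  grade = c(7.1, 7.7, 7.3, 7.9, 6.7),
  num   = c(1, 3, 4, 5, 3),
  stringsAsFactors = FALSE
)

# One row per (ID, type) pair; grade and num are repeated for each type.
# `long` is a hypothetical name used only for this illustration.
long <- separate_rows(df, types, sep = ",")

nrow(long)         # 12 rows: 3 + 2 + 1 + 3 + 3
table(long$types)  # counts per type, e.g. "c" appears 5 times
```

Because grade and num are duplicated for every type an ID has, the points for one ID land on exactly the same coordinates, which is why the plotting code uses geom_jitter rather than geom_point.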