For my thesis, I am making scatterplots in APA format in R. So far, my code is as follows, and it works great for plotting just one variable with confidence interval and regression line:
scatterplot <- ggplot(dat, aes(x = STAIT, y = valence)) +
  geom_point() +
  geom_smooth(method = lm, se = TRUE, fullrange = TRUE, colour = 'black') +
  labs(x = 'STAI-T score', y = 'Report length') +
  apatheme
However, I have two variables that were initially measured on the same 0-100 scale: valence and arousal. Instead of two separate plots, I thought it would be nice to show both variables in a single plot, using 'valence/arousal score' as the y-axis label and open/closed dots to indicate which data points belong to which variable, a bit like in this example I found online. In that example, however, the data come from different groups, so that code doesn't work on my data. I've tried different things, and the closest I've gotten is with the following code:
sp.both <- ggplot(dat, aes(x = STAIT)) +
  geom_point(aes(y = valence)) +
  geom_point(aes(y = arousal)) +
  apatheme
This gives me a scatterplot with the data points of both variables in the same plot. However, I need the data points of one score to be visually different from the other, and I want to add a separate regression line for each variable. Everything I've tried so far has resulted in errors, and I cannot find any examples online of people trying to do the same thing.
Any help would be highly appreciated!
Using some random example data, you could achieve your desired result as shown in the code that follows. It's best to reshape your data to long format first, using e.g. tidyr::pivot_longer, which gives us two new columns: one with the names of the variables and one with the corresponding values. After reshaping, you can map the values column on y and get different point shapes and line types by mapping the variable-name column on shape and linetype:
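library(ggplot2)
library(tidyr)  # provides pivot_longer() and re-exports the %>% pipe

set.seed(42)

# Random example data standing in for the real data set
dat <- data.frame(
  STAIT = runif(20, 0, 1),
  valence = runif(20, 0, 1),
  arousal = runif(20, 0, 1)
)

# Long format: one row per observation per variable
dat_long <- dat %>%
  pivot_longer(c(valence, arousal), names_to = "var", values_to = "value")

# One point shape and one line type per variable
# (in ggplot2 >= 3.4.0, use linewidth = .5 instead of size = .5 for lines)
ggplot(dat_long, aes(x = STAIT, y = value, linetype = var, shape = var)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "black", size = .5)
#> `geom_smooth()` using formula 'y ~ x'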
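To get closer to what the question asks for (open vs. closed dots, a shared y-axis label, and confidence bands), the same long-format approach can be extended with manual scales. The following is only a sketch under some assumptions: the data are simulated, the shape codes 16 (filled circle) and 1 (open circle) and the solid/dashed line types are just one possible choice, and theme_classic() is a stand-in for the asker's apatheme object, which is not defined in the question.

```r
library(ggplot2)
library(tidyr)

set.seed(42)

# Simulated stand-in data (assumption: arbitrary ranges, not the real data)
dat <- data.frame(
  STAIT   = runif(20, 20, 80),
  valence = runif(20, 0, 100),
  arousal = runif(20, 0, 100)
)

# Reshape to long format: one row per participant per variable
dat_long <- pivot_longer(dat, c(valence, arousal),
                         names_to = "var", values_to = "value")

p <- ggplot(dat_long, aes(x = STAIT, y = value, shape = var, linetype = var)) +
  geom_point() +
  # One regression line (with confidence band) per variable
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE,
              fullrange = TRUE, colour = "black") +
  # Filled circles for valence, open circles for arousal
  scale_shape_manual(values = c(valence = 16, arousal = 1)) +
  scale_linetype_manual(values = c(valence = "solid", arousal = "dashed")) +
  labs(x = "STAI-T score", y = "Valence/arousal score",
       shape = NULL, linetype = NULL) +
  theme_classic()  # stand-in; replace with apatheme if it is defined

p
```

With real data, only the simulated data frame needs to be swapped out; the reshaping and plotting steps stay the same.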