I want to plot the regression lines for each city in a scatterplot.
The dataframe looks like:
df
City str testscr
19 Los Angeles 22.70402 619.80
31 San Diego 20.60697 624.55
33 Los Angeles 21.53581 625.30
35 San Bernardino 21.19407 626.10
36 Los Angeles 21.86535 626.80
45 Riverside 19.26697 628.75
46 Los Angeles 23.30189 629.80
63 Orange 21.94756 633.15
67 Los Angeles 20.68242 634.05
69 San Diego 21.78650 634.10
72 Los Angeles 21.15289 634.40
76 San Bernardino 18.98373 634.95
86 San Bernardino 19.30676 636.60
87 Riverside 20.89231 636.70
105 San Bernardino 19.75422 639.35
114 Orange 19.62662 640.75
118 San Diego 20.08452 641.45
126 Riverside 22.81818 643.20
128 Los Angeles 21.37363 643.40
146 San Diego 19.79654 645.55
156 Orange 21.04869 646.70
157 Orange 20.17544 646.90
160 San Diego 20.29137 647.25
168 San Diego 17.15328 648.70
169 San Bernardino 22.34977 648.95
170 Orange 22.17007 649.15
191 Orange 23.01438 652.10
200 Riverside 21.03721 653.40
My approach was:
ggplot(data = df, aes(x = str, y = testscr)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  facet_grid(. ~ City)
Is there a smarter or better way? And how can I add the slope coefficient to each regression line?
Let's deal with groups first, then answer the second part about adding labels.
If you want to plot by group, there are basically two options. The first is to facet, as you have. The second is to group the points, either explicitly using aes(group = City), or by another aesthetic such as aes(color = City).
If the second approach generates a messy plot, for example with lots of overlapping lines, then it's best to go with facets.
A couple of examples using the iris dataset.
First, grouping by color:
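library(ggplot2)
library(magrittr)  # provides the %>% pipe; ggplot2 does not export it

iris %>%
  ggplot(aes(Petal.Length, Sepal.Length)) +
  geom_point(aes(color = Species)) +
  # color = Species also groups the data, so one line is fitted per species
  geom_smooth(method = "lm",
              aes(color = Species),
              se = FALSE)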
Grouping with the group aesthetic:
iris %>%
  ggplot(aes(Petal.Length, Sepal.Length)) +
  geom_point() +
  # group = Species fits one line per species without changing the color
  geom_smooth(method = "lm",
              aes(group = Species),
              se = FALSE)
Using facets:
iris %>%
  ggplot(aes(Petal.Length, Sepal.Length)) +
  geom_point() +
  geom_smooth(method = "lm",
              se = FALSE) +
  facet_wrap(~Species)
For adding labels such as the coefficients, look at the ggpmisc package. Here is one way to add the coefficients using stat_fit_tb:
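library(ggpmisc)  # provides stat_fit_tb(); it relies on the broom package

iris %>%
  ggplot(aes(Petal.Length, Sepal.Length)) +
  geom_point() +
  geom_smooth(method = "lm",
              se = FALSE) +
  facet_wrap(~Species) +
  # adds a small table of fitted coefficients to each panel
  stat_fit_tb(method = "lm",
              tb.type = "fit.coefs")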
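If you only want the slope, and would rather not depend on an extra package, one possible alternative is to fit the models yourself and add the result with geom_text. The following is a minimal sketch under that assumption: the names slopes and label are made up for illustration, and the label position (top-left corner of each panel) is just one choice.

```r
library(ggplot2)

# Fit a separate linear model per species and keep only the slope.
# `slopes` and its `label` column are illustrative names.
slopes <- do.call(rbind, lapply(split(iris, iris$Species), function(d) {
  fit <- lm(Sepal.Length ~ Petal.Length, data = d)
  data.frame(Species = d$Species[1],
             slope = unname(coef(fit)["Petal.Length"]))
}))
slopes$label <- sprintf("slope = %.2f", slopes$slope)

p <- ggplot(iris, aes(Petal.Length, Sepal.Length)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~Species) +
  # One label per panel; -Inf/Inf pin it to the top-left corner
  geom_text(data = slopes,
            aes(x = -Inf, y = Inf, label = label),
            hjust = -0.1, vjust = 1.5,
            inherit.aes = FALSE)
p
```

Because slopes has a Species column, facet_wrap places each label in the matching panel. For your data you would split by City and fit testscr ~ str instead.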