I created the following plot in R with this code:
ggplot(sentiment, aes(x = year, y = nrc_sent$sentiment)) +
  geom_smooth(method = "auto") + # pick a method & fit a model
  scale_x_continuous(breaks = round(seq(min(sentiment$year), max(sentiment$year), by = 2), 1)) +
  labs(x = "", y = "")
geom_smooth() using method = 'loess'
(I got this message when running the code.)
Here nrc_sent is the following tibble:
> nrc_sent
# A tibble: 519 x 3
   sentiment state    year
       <dbl> <chr>   <dbl>
 1      152. Alabama 2007.
 2      107. Alabama 2008.
 3       80. Alabama 2009.
 4       75. Alabama 2010.
 5      173. Alabama 2011.
 6      180. Alabama 2012.
 7      187. Alabama 2013.
 8      167. Alabama 2014.
 9      124. Alabama 2015.
10      215. Alabama 2016.
# ... with 509 more rows
I am puzzled as to what the shaded area around the line represents. I looked at the ggplot2 help page, but I could not find anything I can use in my academic article to explain what the graph shows and what the shaded area is. I would appreciate any help with this.
The documentation for geom_smooth (?geom_smooth) states that the se parameter controls whether a confidence interval is drawn around the fitted line. It defaults to TRUE, and in that case the level parameter sets the confidence level of the interval, with a default of 0.95. So the shaded area in your plot is the 95% confidence interval around the loess fit.
My guess is that the code below, with those arguments written out explicitly, will give you the same plot. Try playing with level.
ggplot(sentiment, aes(x = year, y = nrc_sent$sentiment)) +
  geom_smooth(method = "loess", se = TRUE, level = 0.95) + # pick a method & fit a model
  scale_x_continuous(breaks = round(seq(min(sentiment$year), max(sentiment$year), by = 2), 1)) +
  labs(x = "", y = "")
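To see what se and level actually change, here is a minimal sketch on made-up data. The toy data frame and its values are hypothetical stand-ins for nrc_sent, not your data. It pulls the ribbon limits (ymin and ymax) out of the plot with layer_data() and compares the band at two confidence levels. The last few lines rebuild the band by hand from stats::loess, on the assumption that ggplot2 computes it as the fit plus or minus a t-quantile times the standard error of the fit. That is my understanding of the loess path, so treat the comparison as a check rather than a guarantee.

```r
library(ggplot2)

# Hypothetical toy data standing in for nrc_sent (values are made up).
set.seed(1)
toy <- data.frame(
  year = rep(2007:2016, times = 20),
  sentiment = rnorm(200, mean = 150, sd = 40)
)

base <- ggplot(toy, aes(x = year, y = sentiment))

# Same smoother, three settings: 95% band, 99% band, no band at all.
p95 <- base + geom_smooth(method = "loess", formula = y ~ x, se = TRUE, level = 0.95)
p99 <- base + geom_smooth(method = "loess", formula = y ~ x, se = TRUE, level = 0.99)
p_line <- base + geom_smooth(method = "loess", formula = y ~ x, se = FALSE)

# layer_data() returns the values ggplot2 draws: y is the fitted line,
# ymin and ymax are the lower and upper edges of the shaded ribbon.
band95 <- layer_data(p95)
band99 <- layer_data(p99)

width95 <- band95$ymax - band95$ymin
width99 <- band99$ymax - band99$ymin
print(summary(width95))
print(summary(width99))

# Assumption: the ribbon is fit +/- qt(level / 2 + 0.5, df) * se.fit.
# Rebuild the 95% half-width with stats::loess and compare.
fit <- loess(sentiment ~ year, data = toy)
pred <- predict(fit, newdata = data.frame(year = band95$x), se = TRUE)
half_width <- pred$se.fit * qt(0.975, pred$df)
print(max(abs(half_width - width95 / 2)))
```

Printing p95, p99, and p_line shows the three plots. For an article, a description along the lines of "the line is a loess smooth and the shaded band is its pointwise 95% confidence interval" would match what these settings produce.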