I'm hoping to find a way for line breaks to show up while using geom_smooth() - is this possible?
Here's sample data and code I'm using and the resulting plot:
library(tidyverse)
game_number <- c(1:52)
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7,
15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)
toi_df <- tibble(player = 'Nils Lundkvist', game_number = game_number, toi = toi)
plot <- ggplot(toi_df, aes(x = game_number, y = toi, group = player, colour = player)) +
  geom_line(size = 0.6) +
  geom_smooth(se = FALSE, size = 1) +
  scale_y_continuous(limits = c(0, 25), expand = c(0, 0))
The resulting plot looks like this. You can see the line breaks at the NA values in geom_line(), but the geom_smooth() line connects across the NA values. Is there a way to get geom_smooth() to behave like geom_line() in this scenario? Or is there some other ggplot command to use instead? Thank you!
I would suggest computing the geom_smooth() output in a separate data frame and then merging it with the original data. Here is an approach using the broom and tidyverse packages:
library(tidyverse)
library(broom)
First the data:
#Data
game_number <- c(1:52)
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7,
15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)
toi_df <- tibble(player = 'Nils Lundkvist', game_number = game_number, toi = toi)
Now, we compute the smooth model:
#Create smooth
model <- loess(toi ~ game_number, data = toi_df)
We create a dataframe to save the results:
#Augment model output in a new dataframe
toi_df2 <- augment(model, toi_df)
We merge the data:
#Merge data
toi_df3 <- merge(toi_df,
                 toi_df2[, c("player", "game_number", ".fitted")],
                 by = c("player", "game_number"), all.x = TRUE)
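As a possible shortcut for the augment-and-merge steps above, here is a minimal base-R sketch. It is not part of the approach above. It assumes that fitting loess() with na.action = na.exclude suits your case, in which case fitted() returns one value per original row, with NA wherever toi is NA. The name toi_smooth is only illustrative, and a plain data.frame is used so that no extra packages are needed.

```r
# Same sample data as above, stored in a plain data.frame (illustrative name)
game_number <- 1:52
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
         16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7,
         15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)
toi_smooth <- data.frame(player = "Nils Lundkvist", game_number = game_number, toi = toi)

# na.exclude drops the NA rows for fitting but remembers their positions
model <- loess(toi ~ game_number, data = toi_smooth, na.action = na.exclude)

# fitted() pads the result back to the original length, with NA where toi is NA
toi_smooth$.fitted <- as.numeric(fitted(model))
```

Plotting toi_smooth with geom_line(aes(y = .fitted)) should then break at the same gaps as the raw line.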
Finally, we plot using geom_line():
#Plot
ggplot(toi_df3, aes(x = game_number, y = toi, group = player, colour = player)) +
  geom_line(size = 0.6) +
  geom_line(aes(y = .fitted), size = 1) +
  scale_y_continuous(limits = c(0, 25), expand = c(0, 0))
Output:
The approach also works if you have more than one player. In that case you can group by player (group_by() from dplyr) and use the do() function to estimate the smooth model for each player.
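The grouped idea might be sketched as follows. This is only an illustration under some assumptions: it uses dplyr's group_modify() instead of do(), it fits loess() with na.action = na.exclude so that fitted() keeps NA at the missing games, and the helper name smooth_player and the second player's data (the same values shifted by 15) are made up for the example.

```r
library(dplyr)

# Two players; the second one is dummy data (same values shifted by 15)
game_number <- 1:52
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
         16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7,
         15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)
toi_dfm <- bind_rows(
  tibble(player = "Nils Lundkvist", game_number = game_number, toi = toi),
  tibble(player = "Zach Ellenthal", game_number = game_number, toi = toi + 15)
)

# Hypothetical helper: fit one loess per group and attach the fitted values.
# group_modify() passes the group's rows as `d` and its key as `key`.
smooth_player <- function(d, key) {
  model <- loess(toi ~ game_number, data = d, na.action = na.exclude)
  d$.fitted <- as.numeric(fitted(model))  # NA wherever toi is NA
  d
}

smoothed <- toi_dfm %>%
  group_by(player) %>%
  group_modify(smooth_player) %>%
  ungroup()
```

The resulting smoothed data frame has one .fitted column covering both players, so it can be passed to the same two geom_line() layers used for the single-player plot.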
Update:
I have added code for multiple players. Here I created a function, myfunsmooth(), that computes the loess fit for one player's data. You use split() to get a list with one data frame per player, apply the function to each element, then bind the results and draw the plot. Here is the code:
The dummy data:
#Data
game_number <- c(1:52)
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7,
15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)
toi_df <- tibble(player = 'Nils Lundkvist', game_number = game_number, toi = toi)
toi_df0 <- tibble(player = 'Zach Ellenthal', game_number = game_number, toi = toi)
toi_df0$toi <- toi_df0$toi+15
toi_dfm <- rbind(toi_df,toi_df0)
The function for loess():
#Function for smoothing
myfunsmooth <- function(x) {
  # Fit the loess model for one player's data
  model <- loess(toi ~ game_number, data = x)
  # Augment the model output in a new data frame
  y <- augment(model, x)
  # Merge the fitted values back onto the original rows
  z <- merge(x, y[, c("player", "game_number", ".fitted")],
             by = c("player", "game_number"), all.x = TRUE)
  # Return the merged data
  return(z)
}
Then, we create the list:
#Create list by player
List <- split(toi_dfm,toi_dfm$player)
We apply the function and bind the results in a new dataframe:
#Apply function
List2 <- lapply(List, myfunsmooth)
#Bind all
dfglobal <- do.call(rbind,List2)
rownames(dfglobal)<-NULL
Finally, we plot:
#Plot
ggplot(dfglobal, aes(x = game_number, y = toi, group = player, colour = player)) +
  geom_line(size = 0.6) +
  geom_line(aes(y = .fitted), size = 1) +
  scale_y_continuous(limits = c(0, 45), expand = c(0, 0))
Output: