My query is with reference to this reprex:
library(ggplot2)
d1 <- data.frame(index = 1:100, x = 1:100, x_hat = 1:100 + rnorm(100))
ggplot(data = d1) +
  geom_line(aes(x = index, y = x, color = "True X")) +
  geom_line(aes(x = index, y = x_hat, color = "Estimated X")) +
  scale_x_continuous(name = "") +
  ylab("")
The code does what I want, but I don't understand how. When I write color = "True X", I think it generates a constant variable on the fly.
Is that correct? How does it work? Can someone say a few words on this? The beauty of this approach is that it automatically creates a correct legend.
Your intuition is basically correct. A string constant inside aes() in each geom_line is treated as a one-level discrete variable mapped to color, so ggplot2 draws the line in a default color and adds a legend entry labelled with whatever string you specified after color =. If you specified the same string in both geoms (e.g. color = "True X"), both lines would be drawn in the same reddish default color and the legend would have only one label. In other words, each unique string constant tells ggplot2 to draw the respective line in a different color and add a label to the legend.
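One way to check this yourself is to look at the data ggplot2 builds for each layer. The sketch below is only illustrative: the set.seed() call and the names p, true_layer and est_layer are my own additions, not part of the question. It uses layer_data() to pull out the computed data for each geom_line and shows that every row in a layer gets the same colour, and that the two layers get different ones.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the noise is reproducible
d1 <- data.frame(index = 1:100, x = 1:100, x_hat = 1:100 + rnorm(100))

p <- ggplot(d1) +
  geom_line(aes(x = index, y = x, color = "True X")) +
  geom_line(aes(x = index, y = x_hat, color = "Estimated X"))

# Computed data for each layer, after the colour scale has been applied.
true_layer <- layer_data(p, 1)
est_layer  <- layer_data(p, 2)

# Each layer holds a single colour, because the string constant was
# recycled to every row and mapped through the discrete colour scale.
unique(true_layer$colour)
unique(est_layer$colour)
```

The two calls to unique() should each return a single colour code, and the two codes should differ, since both strings are levels of the same colour scale.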
If you want to customize further, you can add scale_color_manual to your call to ggplot. For instance, scale_color_manual("Type of X", values = c("blue", "red")) would add a proper title to the legend and change the colors of the two lines to whatever you want (in this case blue and red).
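One thing to watch: as far as I know, an unnamed values vector is matched to the legend labels in order, which is usually alphabetical, so c("blue", "red") would give "Estimated X" blue and "True X" red. A minimal sketch that avoids depending on that order uses a named vector instead. The set.seed() call, the object name p and the explicit name = argument are my own choices here.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the noise is reproducible
d1 <- data.frame(index = 1:100, x = 1:100, x_hat = 1:100 + rnorm(100))

p <- ggplot(d1) +
  geom_line(aes(x = index, y = x, color = "True X")) +
  geom_line(aes(x = index, y = x_hat, color = "Estimated X")) +
  # Named vector: each legend label is tied to its colour explicitly.
  scale_color_manual(
    name = "Type of X",
    values = c("True X" = "blue", "Estimated X" = "red")
  ) +
  labs(x = NULL, y = NULL)

p
```

The names in values must match the strings used in aes() exactly, otherwise the corresponding line has no colour assigned to it.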