Tags: r, time-series, data-visualization, linear-regression, trendline

How do I add a smooth trend line using linear regression to a time series plot?


ggplot()+
  geom_line(data=combined,aes(x=reorder(dates,value),y=value,color=variable,group=variable))+
  scale_y_continuous(labels=function(x) format(x,scientific=FALSE))+
  theme_gray()+
  theme(axis.text.x = element_text(angle=90),
        plot.title=element_text(hjust=0.5),plot.subtitle = 
          element_text(hjust = 0.5))+
  annotate(geom = 'text',x='03.5.20',y=150000,label=
             'WHO declares Covid-19 a Pandemic')+
  annotate(geom = 'point',x='03.11.20',y=125865,size=6,shape=21,fill='blue')+
  labs(title='Cases in China vs World ',x='Daily trend from January to March',y='Case Numbers ',
       subtitle = 'Data From Jan 22,2020 - Mar 23,2020')

This is my regular plot, and it works. Now I am trying to add a smooth trend line using linear regression. I tried stat_smooth(method='lm', formula=?), but the example I am working from uses y ~ x.

My problem is that I have dates on my x axis, and I am not sure where to go from here.

This is the data I am using:

      dates variable value
1  01.22.20    World   555
2  01.23.20    World   653
3  01.24.20    World   941
4  01.25.20    World  1434
5  01.26.20    World  2118
6  01.27.20    World  2927
7  01.28.20    World  5578
8  01.29.20    World  6166
9  01.30.20    World  8234
10 01.31.20    World  9927
63 01.22.20    China   548
64 01.23.20    China   643
65 01.24.20    China   920
66 01.25.20    China  1406
67 01.26.20    China  2075
68 01.27.20    China  2877
69 01.28.20    China  5509
70 01.29.20    China  6087
71 01.30.20    China  8141
72 01.31.20    China  9802

Any tips on how to approach this would be appreciated.


Solution

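  • You can use geom_smooth(method = "lm"). Its default formula, y ~ x, refers to the x and y aesthetics rather than to column names, so it needs no change when x holds dates of class Date, as in the data below. If you do not want a separate smooth per color, map color and group only in geom_line instead of in the global aesthetics.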

    Data

    df <- structure(list(dates = structure(c(18283, 18284, 18285, 18286, 
    18287, 18288, 18289, 18290, 18291, 18292, 18283, 18284, 18285, 
    18286, 18287, 18288, 18289, 18290, 18291, 18292), class = "Date"), 
        variable = c("World", "World", "World", "World", "World", 
        "World", "World", "World", "World", "World", "China", "China", 
        "China", "China", "China", "China", "China", "China", "China", 
        "China"), value = c(555L, 653L, 941L, 1434L, 2118L, 2927L, 
        5578L, 6166L, 8234L, 9927L, 548L, 643L, 920L, 1406L, 2075L, 
        2877L, 5509L, 6087L, 8141L, 9802L)), class = "data.frame", row.names = c(NA, 
    -20L))
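
    The df above already stores dates as class Date, whereas the question's table shows them as strings such as 01.22.20. A minimal sketch of the conversion, assuming the month.day.two-digit-year layout seen in that table (the raw vector is a hypothetical stand-in for the question's dates column):

    ```r
    # Hypothetical stand-in for the character dates column in the question
    raw <- c("01.22.20", "01.23.20", "01.31.20")

    # %m = month, %d = day of month, %y = two-digit year
    dates <- as.Date(raw, format = "%m.%d.%y")

    class(dates)           # "Date"
    as.numeric(dates[1])   # days since 1970-01-01, the numbers stored in df
    ```

    With a real Date column, x = dates can be mapped directly and the axis is ordered chronologically without reorder().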
    

    Code

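    library(dplyr)    # provides the %>% pipe
    library(ggplot2)

    # One linear trend line per group (World, China)
    df %>%
      ggplot(aes(x=reorder(dates,value),y=value,color=variable,group=variable))+
      geom_line()+
      geom_smooth(method = "lm", se = FALSE)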
    


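    General smooth (a single trend line across both groups)

    # color and group are mapped only in geom_line;
    # aes(group = 1) pools all rows so one line is fitted
    df %>%
      ggplot(aes(x=reorder(dates,value),y=value))+
      geom_line(aes(color=variable,group=variable))+
      geom_smooth(aes(group = 1),method = "lm", se = FALSE)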
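
    To read off the numbers behind such trend lines, the same kind of straight-line fit can be run with lm() directly. This is a sketch in base R, not part of the original answer: it rebuilds the sample data so it runs on its own, and the day column (days since the first date) is an assumed helper that stands in for the x axis.

    ```r
    # Rebuild the sample data from the answer
    df <- data.frame(
      dates    = rep(as.Date("2020-01-22") + 0:9, times = 2),
      variable = rep(c("World", "China"), each = 10),
      value    = c(555, 653, 941, 1434, 2118, 2927, 5578, 6166, 8234, 9927,
                   548, 643, 920, 1406, 2075, 2877, 5509, 6087, 8141, 9802)
    )

    # Assumed helper column: days elapsed since the first date
    df$day <- as.numeric(df$dates - min(df$dates))

    # One straight-line fit per group, like the per-color smooths
    fits   <- lapply(split(df, df$variable), function(d) lm(value ~ day, data = d))
    slopes <- sapply(fits, function(f) coef(f)[["day"]])
    slopes  # average change in cases per day, per group

    # One pooled fit, like the aes(group = 1) smooth
    pooled <- lm(value ~ day, data = df)
    coef(pooled)
    ```

    Because the sample dates are consecutive days, the slope per day equals the slope per axis position that the plots above use.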
    
