
Error: Columns `y`, `colour` must be 1d atomic vectors or lists, but I have no column `colour`


I am trying to create a simple ggplot with three geom_line layers showing the original series and its 5-year and 10-year moving averages. My data frame is temp and the column of interest is AverageTemperature. However, I cannot understand the following errors:

Error: Columns 'y', 'colour' must be 1d atomic vectors or lists,

Error: 'mapping' must be created by 'aes()'

I have no column called y or colour, and all my mappings are written with aes(). Other answers do not seem to explain the reason behind the errors. My code is as follows:

library(ggplot2)
library(forecast)
ma <- ma(temp$AverageTemperature, order = 5)
ma2 <- ma(temp$AverageTemperature, order = 10)

ggplot(temp, x= dt) + 
    geom_line(temp, aes(y = AverageTemperature, size = 1.5)) + scale_y_log10() + xlim(1870, 2000) +
    geom_line(temp, aes(y = ma, color = ma, size = 1.5)) +  
    geom_line(temp, aes(y = ma2, color = ma, size = 1.5))

My required result would look like the following graph:

https://www.datascience.com/hs-fs/hubfs/learn-data-science-forecasting-with-ARIMA-chart-3.png?width=1900&height=713&name=learn-data-science-forecasting-with-ARIMA-chart-3.png

Sample data using dput:

structure(list(dt = c(1743L, 1744L, 1745L, 1750L, 1751L, 1752L ), AverageTemperatureUncertainty = c(3.1304125, 3.0976671875, 3.00175, 3.13747272727273, 3.09229285714285, 3.06561458333333 )), row.names = c(NA, 6L), class = "data.frame")

Could someone explain what the errors are please?

Many Thanks.


Solution

  • The errors mostly come down to what is placed inside and outside the aes() calls. aes() should contain all of the values that vary per data point, and only those:

    • Columns 'y', 'colour' refers to your y = ma and y = ma2 mappings (and the color = ma mappings that follow them). ggplot2 looks for these variables inside the data frame temp given in the first ggplot() call and cannot find them there.
    • You don't want the colour to vary by data point, so in each geom_line() call move color out of aes() and set it to a constant colour.
    • size = 1.5 is likewise constant within each geom_line() call, so it should also sit outside aes().
    • 'mapping' must be created by 'aes()' probably refers both to the x = dt part of the ggplot() call (which should be wrapped in aes() as well) and to temp being passed as the first argument of each subsequent geom_line() call, where the mapping is expected.
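To make the last bullet concrete, here is a minimal sketch of the argument-order issue. It relies on the fact that the first positional argument of geom_line() is mapping and the second is data (the reverse of ggplot()). The data frame is a randomly generated stand-in (an assumption, not the asker's real data), and the exact wording of the error may differ between ggplot2 versions.

```r
library(ggplot2)

# Hypothetical stand-in data; only the column names follow the question.
set.seed(1)
temp <- data.frame(dt = 1870:2000, AverageTemperature = rnorm(131, 10, 3))

# Passing the data frame first puts it in the `mapping` slot, which
# ggplot2 rejects. Capture the error message instead of stopping.
err <- tryCatch(
  geom_line(temp, aes(y = AverageTemperature)),
  error = function(e) conditionMessage(e)
)
print(err)

# Either name the argument (data = temp) or drop it so the layer
# inherits the data from ggplot(). Constants such as color stay outside aes().
p <- ggplot(temp, aes(x = dt)) +
  geom_line(data = temp, aes(y = AverageTemperature), color = "orange")
```

Dropping data = temp from the layer would give the same plot here, since layers inherit the data frame supplied to ggplot() by default.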

    It would be easier and tidier to calculate the moving averages and combine these into the original dataframe:

    library(dplyr)
    library(ggplot2)
    
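    set.seed(1421)
    # Moving average built on stats::filter(), so the forecast package is not needed
    ma <- function(x, order = 5) {
      stats::filter(x, rep(1 / order, order), sides = 2)
    }
    
    # Randomly generated stand-in for the real temperature data
    temp <- tibble(AverageTemperature = rnorm(131, 10, 3), dt = seq(1870, 2000))
    
    # Store both moving averages as columns so that aes() can find them in temp
    temp$ma1 <- ma(temp$AverageTemperature, order = 5)
    temp$ma2 <- ma(temp$AverageTemperature, order = 10)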
    
    ggplot(temp, aes(x = dt)) + 
      geom_line(aes(y = AverageTemperature), color="orange", size = 1.5) + scale_y_log10() + xlim(1870, 2000) +
      geom_line(aes(y = ma1), color = "red", size = 1.5) +  
      geom_line(aes(y = ma2), color = "blue", size = 1.5)
    

    Giving the graph (from randomly generated values):

    [Image: sample graph]

    (I've defined ma() as a new function based on stats::filter(), rather than using the one from the forecast package.)
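As a quick check of what such a helper computes, here is a small sketch in base R on a toy vector. The as.numeric() wrapper is an addition for readability (stats::filter() returns a ts object); the rest mirrors the ma() definition in the answer.

```r
# Moving average with equal weights, window centred on each point (sides = 2).
ma <- function(x, order = 5) {
  as.numeric(stats::filter(x, rep(1 / order, order), sides = 2))
}

x <- 1:7
ma(x, order = 3)  # NA 2 3 4 5 6 NA
ma(x, order = 5)  # NA NA 3 4 5 NA NA
```

The NA values at both ends are positions where the window does not fit inside the series, which is why the moving-average lines in a plot start later and end earlier than the raw series.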

    Does this help? You'll need to play around with colours etc. to suit.

    Edit: having seen Nelson's code above, that approach is handier because it produces a legend. Instead of the last ggplot command in my code above, you could do:

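    library(tidyr)  # provides gather()
    
    # Reshape columns 1, 3 and 4 (AverageTemperature, ma1, ma2) into long format
    temp %>% gather("id", "value", c(1, 3, 4)) %>% 
      ggplot(aes(dt, value, col = id)) +
      geom_line(size = 1.5) + scale_y_log10() +
      xlim(1870, 2000)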
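For reference, the same long-format reshaping can also be sketched with tidyr::pivot_longer(), which selects columns by name rather than by position. This is an alternative, not the original answer's code. It assumes randomly generated stand-in data, converts the moving averages to plain numeric vectors, and uses linewidth in place of size, which assumes ggplot2 3.4.0 or later.

```r
library(ggplot2)
library(tidyr)

set.seed(1421)
# Same moving-average idea as above; as.numeric() drops the ts class.
ma <- function(x, order = 5) {
  as.numeric(stats::filter(x, rep(1 / order, order), sides = 2))
}

# Hypothetical stand-in data with the column names used in the question.
temp <- data.frame(dt = 1870:2000, AverageTemperature = rnorm(131, 10, 3))
temp$ma1 <- ma(temp$AverageTemperature, order = 5)
temp$ma2 <- ma(temp$AverageTemperature, order = 10)

# One row per (year, series) pair; `id` names the series.
long <- pivot_longer(
  temp,
  cols = c(AverageTemperature, ma1, ma2),
  names_to = "id",
  values_to = "value"
)

# Mapping colour to `id` inside aes() is what generates the legend.
p <- ggplot(long, aes(dt, value, colour = id)) +
  geom_line(linewidth = 1.5, na.rm = TRUE) +
  scale_y_log10() +
  xlim(1870, 2000)
```

Here colour belongs inside aes() because it now varies with the data (one value of id per series), in contrast to the constant colours set outside aes() earlier.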