Tags: r, error-handling, dataset, linear-regression, lm

How can I fix this invalid type error using lm()?


Error in model.frame.default(formula = data$conservationstatus ~ data$latitude,  : 
  invalid type (NULL) for variable 'data$conservationstatus'

I have a dataset (called data, after reading a CSV file), and it has the columns Conservation Status and Latitude. I'm trying to perform linear regression on these two using

lm(data$ConservationStatus ~ data$Latitude, data = data)

However, I keep getting the error above. It seems to be because my column name has two words in it. I've tried data$Conservation Status, data$'Conservation Status', and data$Conservation.Status, but nothing seems to work :(


Solution

  • We can specify the formula without data$, because lm() looks the variables up in the data argument. If a column name contains spaces, wrap it in backticks:

    model <- lm(`Conservation Status` ~ Latitude, data = data)
    

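    The error can be reproduced with a simple example. Here the column name is deliberately misspelled (epal.Length instead of Sepal.Length), so iris$epal.Length refers to a column that does not exist and evaluates to NULL: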

    data(iris)
    lm(iris$epal.Length ~ iris$Species, iris)
    

    Error in model.frame.default(formula = iris$epal.Length ~ iris$Species, : invalid type (NULL) for variable 'iris$epal.Length'

    Using the correct syntax (bare column names, with the data frame passed as data) works:

    lm(Sepal.Length ~ Species, iris)
    
    #Call:
    #lm(formula = Sepal.Length ~ Species, data = iris)
    
    #Coefficients:
    #      (Intercept)  Speciesversicolor   Speciesvirginica  
    #            5.006              0.930              1.582
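
For the spaced column name from the question, a minimal sketch might look like the following. The CSV text and its values are made up for illustration, and it assumes the conservation status is coded as a number; how the real file was imported is not known from the question. The sketch shows that read.csv() by default rewrites the header to Conservation.Status, that check.names = FALSE keeps the space, that `$` with a name that does not exist returns NULL (which is what lm() complains about), and that both spellings fit the same model once the name in the formula matches the name actually stored.

```r
# Hypothetical CSV content standing in for the real file (values are made up).
csv_text <- "Conservation Status,Latitude
1,10.5
2,20.1
3,30.7
2,25.0"

# By default read.csv() passes the header through make.names(),
# which replaces the space with a dot.
d1 <- read.csv(text = csv_text)
names(d1)

# check.names = FALSE keeps the header exactly as written in the file.
d2 <- read.csv(text = csv_text, check.names = FALSE)
names(d2)

# A name that matches no column gives NULL, the source of the lm() error.
is.null(d2$ConservationStatus)

# Use the name that is actually stored; backticks are needed for the space.
fit1 <- lm(Conservation.Status ~ Latitude, data = d1)
fit2 <- lm(`Conservation Status` ~ Latitude, data = d2)

# Both fits describe the same model, so the coefficients agree.
all.equal(unname(coef(fit1)), unname(coef(fit2)))
```

Running names(data) on the real data frame first is probably the quickest way to see which of these spellings applies.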