
R: Apply a list with glm-function


I'm having trouble passing variables that are stored in a list to the glm function. My dataframe has lots of variables, so it would be too much effort to type the independent variables one by one into glm. Let's say my dataframe is

df <- data.frame(
  id = c(1,2,3,4,5),
  `sitting position` = c("A","B","A","A","B"),
  `variable one` = c("left", "left", "right", "right", "left"),
  `variable two` = c(50, 30, 45, 80, 57),
  `variable three` = c("m","w","w","m","m"),
  check.names = FALSE)

and my list of columns which I want to use in a glm-function looks like this

columns <- colnames(df)[-c(1:2)]

columns 
[1] "variable one"   "variable two"   "variable three"

Now I want to pass this list directly to glm, something like

glm(`sitting position` ~ columns, data = df, family = binomial)

instead of

glm(`sitting position` ~ `variable one` + `variable two` + `variable three`, data = df, family = binomial())

I am aware that it won't work just by adding the list, but I also can't find a solution to this problem.


Solution

  • One option is reformulate, which builds a formula from a character vector of term labels. Its result can be passed to the formula argument of glm.

    I included a preliminary step that uses janitor::clean_names to replace your column names with syntactic ones, which are cleaner and less error-prone because they need no backticks in a formula.

    library(janitor)

    # Convert names such as `variable one` to variable_one
    df <- clean_names(df)
    columns <- c('variable_one', 'variable_two', 'variable_three')
    

    And then the actual solution:

    glm(formula = reformulate(termlabels = columns, response = 'sitting_position'), data = df, family = binomial)
    

    See how reformulate works:

    reformulate(termlabels = columns, response='sitting_position')
    
    sitting_position ~ variable_one + variable_two + variable_three
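
If you would rather keep the original column names, a minimal sketch along the following lines may work. It wraps each name in backticks before handing it to reformulate and passes the response as a symbol via as.name. The explicit factor() conversion, the setdiff() selection and the names f and fit are my own additions for illustration, not part of the answer above. With only five rows, glm may well emit convergence or separation warnings.

```r
# Data from the question, with the original non-syntactic column names
df <- data.frame(
  id = c(1, 2, 3, 4, 5),
  `sitting position` = c("A", "B", "A", "A", "B"),
  `variable one` = c("left", "left", "right", "right", "left"),
  `variable two` = c(50, 30, 45, 80, 57),
  `variable three` = c("m", "w", "w", "m", "m"),
  check.names = FALSE
)

# Assumption: make the two-level response an explicit factor for binomial glm
df$`sitting position` <- factor(df$`sitting position`)

# Every column except the id and the response
columns <- setdiff(colnames(df), c("id", "sitting position"))

# Backtick-quote names containing spaces so each parses as a single term
f <- reformulate(
  termlabels = sprintf("`%s`", columns),
  response = as.name("sitting position")
)

# The toy data is tiny, so glm may warn here
fit <- glm(f, data = df, family = binomial)
```

Printing f should show the same structure as the formula typed out by hand in the question, with each name in backticks.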