Is there a method for iterating data frame variables in a formula object?

In my case, I want to fit several glm and lda models on a given subset of variables. The response (Y) is the same in every model, and the predictors are added by forward best subset selection, in the order of significance found by a random forest analysis.

However, I can't find a way to iterate over the variables. My attempt looks as follows:

# Ordered data frame (ordered_df_train) is just the data frame ordered using the
# previously mentioned method, with the first variable being crim (the output)
list_formula <- vector(mode = "list", length = 13)
list_formula[[1]] <- ordered_df_train$crim ~ ordered_df_train$age
for(j in 3:14){
  list_formula[[j-1]] <- ordered_df_train$colnames(ordered_df_train)[j]
}

However,

ordered_df_train$colnames(ordered_df_train)[j]

returns NULL when executed, so the expected variable is not picked up.

Edit: As suggested, the previously used data for reproducibility is defined as:

library(MASS)
df_train <- Boston
ordered_df_train <- data.frame(
    crim = df_train$crim,
    age = df_train$age,
    nox = df_train$nox,
    tax = df_train$tax,
    indus = df_train$indus,
    dis = df_train$dis,
    rad = df_train$rad,
    black = df_train$black,
    rm = df_train$rm,
    lstat = df_train$lstat,
    zn = df_train$zn,
    ptratio = df_train$ptratio,
    medv = df_train$medv,
    chas = df_train$chas
)

I hope this makes my question reproducible. The objective is to build a list of formulas following the forward method for best subset selection, adding the next most significant variable after each iteration.

Solution

Currently, you are not calling colnames properly. It is a function from the base package, not an element of the data frame, so it cannot be accessed with $. Even with that fixed, colnames returns strings, which you need to convert to a formula, for example with as.formula.
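To illustrate the difference, here is a minimal sketch on a small made-up data frame (toy_df and its columns are hypothetical and not part of the question). It calls colnames as a function, looks the column up by its string name with [[ ]], and converts the string to a formula.

```r
# Hypothetical toy data, used only for illustration
toy_df <- data.frame(y = c(1, 2, 3), a = c(4, 5, 6), b = c(7, 8, 9))

# colnames() is a function applied to the data frame, not a member reached with $
col_name <- colnames(toy_df)[2]

# A column whose name is held in a string is looked up with [[ ]], not $
col_values <- toy_df[[col_name]]

# The string still has to be converted before it can be used as a formula
f <- as.formula(paste("y ~", col_name))
class(f)
```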

Also, consider using lapply to avoid the bookkeeping of initializing a list and then assigning elements by index. Use [-1] to drop the first column name (the response, crim).

list_formula <- lapply(
  colnames(ordered_df_train)[-1],
  # paste0 avoids the space that paste would insert after the $
  function(col) as.formula(
    paste0("ordered_df_train$crim ~ ordered_df_train$", col)
  )
)

list_formula
# [[1]]
# ordered_df_train$crim ~ ordered_df_train$age
# <environment: 0x000002842a33f240>
#   
# [[2]]
# ordered_df_train$crim ~ ordered_df_train$nox
# <environment: 0x000002842a32c270>
#   
# [[3]]
# ordered_df_train$crim ~ ordered_df_train$tax
# <environment: 0x000002843931fd10>
#   
# [[4]]
# ordered_df_train$crim ~ ordered_df_train$indus
# <environment: 0x00000284365dc340>
#   
# [[5]]
# ordered_df_train$crim ~ ordered_df_train$dis
# <environment: 0x00000284379d9800>
#   
# [[6]]
# ordered_df_train$crim ~ ordered_df_train$rad
# <environment: 0x00000284379d7fb8>
#   
# [[7]]
# ordered_df_train$crim ~ ordered_df_train$black
# <environment: 0x00000284393cf6e0>
#   
# [[8]]
# ordered_df_train$crim ~ ordered_df_train$rm
# <environment: 0x00000284379ef078>
#   
# [[9]]
# ordered_df_train$crim ~ ordered_df_train$lstat
# <environment: 0x000002843959d320>
#   
# [[10]]
# ordered_df_train$crim ~ ordered_df_train$zn
# <environment: 0x000002843959bad8>
#   
# [[11]]
# ordered_df_train$crim ~ ordered_df_train$ptratio
# <environment: 0x00000284393e4ba8>
#   
# [[12]]
# ordered_df_train$crim ~ ordered_df_train$medv
# <environment: 0x00000284366e3348>
#   
# [[13]]
# ordered_df_train$crim ~ ordered_df_train$chas
# <environment: 0x00000284364db798>
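
The list above holds one single-predictor formula per column. If the goal is instead the cumulative forward sequence described in the question (each formula adds the next variable), one possible sketch is shown here. It assumes the predictors are already in the desired order, uses unqualified column names, and shows only the first three names from the question to keep it short; predictors and forward_formulas are illustrative names.

```r
# First few predictors in the order given in the question (shortened for illustration)
predictors <- c("age", "nox", "tax")

# Formula i contains the first i predictors joined with " + "
forward_formulas <- lapply(
  seq_along(predictors),
  function(i) as.formula(
    paste("crim ~", paste(predictors[seq_len(i)], collapse = " + "))
  )
)

# Show the formulas as plain strings
sapply(forward_formulas, deparse)
```

Because these formulas carry no data frame qualifier, the data frame would have to be supplied through the data argument of the modeling function.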

Consider also reformulate, which builds the formula without as.formula + paste. The formulas below do not include the data frame qualifier, but you may be able to pass the data frame to the data argument of your modeling function.

list_formula <- lapply(
  colnames(ordered_df_train)[-1], function(col) reformulate(col, "crim")
)

list_formula
# [[1]]
# crim ~ age
# <environment: 0x000002843a203a18>
#   
# [[2]]
# crim ~ nox
# <environment: 0x000002843a20ad68>
#   
# [[3]]
# crim ~ tax
# <environment: 0x000002843a274678>
#   
# [[4]]
# crim ~ indus
# <environment: 0x000002843a279b18>
#   
# [[5]]
# crim ~ dis
# <environment: 0x000002843a282de8>
#   
# [[6]]
# crim ~ rad
# <environment: 0x000002843a286368>
#   
# [[7]]
# crim ~ black
# <environment: 0x000002843a2898e8>
#   
# [[8]]
# crim ~ rm
# <environment: 0x000002843a28ed88>
#   
# [[9]]
# crim ~ lstat
# <environment: 0x000002843a296138>
#   
# [[10]]
# crim ~ zn
# <environment: 0x000002843a2996b8>
#   
# [[11]]
# crim ~ ptratio
# <environment: 0x000002843a29eb58>
#   
# [[12]]
# crim ~ medv
# <environment: 0x000002843a2a5f08>
#   
# [[13]]
# crim ~ chas
# <environment: 0x000002843a2a9488>
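
As a rough sketch of how unqualified formulas could be used with the data argument, the example below fits one glm per formula. It works on Boston directly rather than on ordered_df_train, leaves family at the glm default (gaussian), and compares fits by AIC; all three choices are assumptions made for illustration, not something stated in the question.

```r
library(MASS)  # provides the Boston data used in the question

# Every column except the response
predictors <- setdiff(colnames(Boston), "crim")

# One single-predictor formula per column, without any data frame qualifier
formulas <- lapply(predictors, function(col) reformulate(col, response = "crim"))

# The data frame is supplied through `data`, so the formulas stay short.
# family is left at the glm default (gaussian), which is an assumption here.
models <- lapply(formulas, function(f) glm(f, data = Boston))
names(models) <- predictors

# Compare the fits, for example by AIC (lower is better)
aic_values <- sapply(models, AIC)
head(sort(aic_values))
```

The same pattern should carry over to other modeling functions that accept a formula and a data argument, such as MASS::lda, provided the response is suitable for that model.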