Iterate over models in R without hardcoding each formula


I have this code:

#  [1] "X6"  "X5"  "X7"  "X4"  "X3"  "X8"  "X19" "X9"  "X10" "X16"    
formula = result ~ X3+X4+X5+X6+X7+X8+X9+X10+X16+X19
full_magnetude_model <- glm(formula, data = train)
full_magnetude_predict = predict(full_magnetude_model, newdata=test)

# Comparing results
full_magnetude_results <- ifelse(full_magnetude_predict > 0.5, 1, 0)
true_results = test$result

# results
table(full_magnetude_results,true_results)

It works properly, but the results fluctuate between formulas, and I need to do the same for:

#  [1] "X6"  "X5"  "X7"  "X4"  "X3"  "X8"  "X19" "X9"  "X10" "X16"    
#  [1] "X6"  "X5"  "X7"  "X4"  "X3"  "X8"  "X19" "X9"  "X10"    
#  [1] "X6"  "X5"  "X7"  "X4"  "X3"  "X8"  "X19" "X9"      
#  [1] "X6"  "X5"  "X7"  "X4"  "X3"  "X8"  "X19"     
#  [1] "X6"  "X5"  "X7"  "X4"  "X3"  "X8"      

and so on. I can do this manually, but is there a smarter way to do it?

Update

Full code: https://github.com/martin-varbanov96/fmi_summer_2018/blob/master/fmi_6ti_sem/Pril_stat/project/main.R

The idea is to make a list of formulas and apply my code to each element of the list.


Solution

  • Probably not a full answer, but this should give you an idea:

    terms=c("X3", "X4", "X5", "X6", "X7", "X8", "X9", "X10", "X16", 
            "X19")
    
    lapply(10:6,function(x) {
      formula <- as.formula(paste("result ~ ", paste0(terms[1:x],collapse="+")))
      formula
    })
    

    Gives:

    [[1]]
    result ~ X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10 + X16 + X19
    <environment: 0x0000000016fbec50>
    
    [[2]]
    result ~ X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10 + X16
    <environment: 0x0000000016fc29a0>
    
    [[3]]
    result ~ X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10
    <environment: 0x00000000170f44b0>
    
    [[4]]
    result ~ X3 + X4 + X5 + X6 + X7 + X8 + X9
    <environment: 0x00000000170f8a98>
    
    [[5]]
    result ~ X3 + X4 + X5 + X6 + X7 + X8
    <environment: 0x00000000170fd1d0>
    

    lapply iterates over the range 10:6, passing each value to the anonymous function as x, which selects the first x terms.

    The idea is to build each formula from the terms you need, here dropping the last one at each step, by pasting them together as shown in the ?as.formula documentation. The rest of your code can be used as is inside the function; the resulting list will then contain tables instead of the formulas shown in this example.
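
To make the last step concrete, here is a minimal sketch of how the fitting and comparison code from the question might be moved inside the function. Several things here are assumptions and not part of the original post: the train and test data frames are simulated stand-ins for the asker's data, and the names predictors, make_data, evaluate_terms and confusion_tables are hypothetical. The term order is the magnitude order from the comments in the question, so dropping the last term each time gives the subsets listed there.

```r
set.seed(42)

# Terms in the order shown in the question's comments (assumed to be ranked by magnitude).
predictors <- c("X6", "X5", "X7", "X4", "X3", "X8", "X19", "X9", "X10", "X16")

# Hypothetical stand-in data; replace with the real train/test data frames.
make_data <- function(n) {
  df <- as.data.frame(matrix(rnorm(n * length(predictors)), nrow = n,
                             dimnames = list(NULL, predictors)))
  df$result <- as.integer(df$X6 + df$X5 + rnorm(n) > 0)
  df
}
train <- make_data(200)
test  <- make_data(100)

# Build the formula from the first n_terms predictors, fit the model,
# predict on the test set and return the comparison table.
evaluate_terms <- function(n_terms) {
  model_formula <- as.formula(
    paste("result ~", paste(predictors[1:n_terms], collapse = " + "))
  )
  model <- glm(model_formula, data = train)
  predicted <- ifelse(predict(model, newdata = test) > 0.5, 1, 0)
  table(predicted = predicted, actual = test$result)
}

# One table per formula: 10 terms, then 9, ... down to 6.
term_counts <- 10:6
confusion_tables <- lapply(term_counts, evaluate_terms)
names(confusion_tables) <- paste0("first_", term_counts, "_terms")

confusion_tables
```

Naming the list elements makes it easier to see which table belongs to which formula, for example confusion_tables$first_8_terms. The glm call is kept as in the question, with no family argument.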