I have a project to estimate commercial real estate property values from categorical and continuous variables. I ran a stepwise linear regression in RStudio to pick out a good formula. (Adj. R-squared = 0.90. I know I still need to do PCA and some type of categorical ANOVA test, but I just want to get a beta estimator out before going deeper.)
How do I take the output of my step(lm()) call and create a character-based algebraic expression/equation with the coefficients, like:
y = M1*X1 + M2*X2 + ... + Mn*Xn
Where M is my coefficient and X is my variable. I know I could do it by hand in Excel, but with so many interactions and base variables that seems excessive. Maybe there is a function in R for this, or I could write a Python function that prompts for the values of the variables needed for the calculation, but I haven't found or come up with one.
Thank you so much! If anything is not specific enough I will do my best to further explain.
As far as I know, there is no built-in function to do this (formulas can get quite messy for different model types). But if you have a simple linear model, perhaps this function can help:
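as_disp_eqn <- function(model, formatter = prettyNum, ...) {
  # Recover the name of the response variable from the model terms
  model_terms <- terms(model)
  model_response <- deparse(as.list(attr(model_terms, "variables")[-1])[[attr(model_terms, "response")]])
  coefs <- coef(model)
  # Format each coefficient; append " * <term>" to everything except the intercept
  rhs <- paste0(formatter(coefs, ...), ifelse(names(coefs) != "(Intercept)", paste(" *", names(coefs)), ""))
  # Join the terms into a single "response = ..." string
  paste(model_response, "=", paste(rhs, collapse = " + "))
}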
You can pass in a model and get the equation back as a character string. For example:
mm <- lm(formula = Sepal.Length ~ Sepal.Width + Petal.Length + Species, data = iris)
as_disp_eqn(mm, digits=2)
# [1] "Sepal.Length = 2.4 + 0.43 * Sepal.Width + 0.78 * Petal.Length + -0.96 * Speciesversicolor + -1.4 * Speciesvirginica"
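The output above shows "+ -0.96" for negative coefficients. If that is distracting, a variant along the following lines might help; it is only a minimal sketch. The name as_disp_eqn_signed is made up for illustration, and it assumes a plain lm fit (which is also what step() returns for an lm) with no NA coefficients. The last lines use base R's predict() with a newdata data frame, which is probably the easier route for actually computing an estimate from given inputs rather than re-typing the equation.

```r
# Hypothetical variant of as_disp_eqn(): writes "- 0.96 * x" instead of "+ -0.96 * x".
# Assumes a plain lm() fit with no NA (aliased) coefficients.
as_disp_eqn_signed <- function(model, digits = 2) {
  model_terms <- terms(model)
  # "variables" is a call whose first element is `list`, hence the + 1
  response <- deparse(attr(model_terms, "variables")[[attr(model_terms, "response") + 1]])
  coefs <- coef(model)
  # Format magnitudes only; the sign is carried by the joining operator
  mags <- vapply(abs(coefs), format, character(1), digits = digits)
  is_int <- names(coefs) == "(Intercept)"
  pieces <- ifelse(is_int, mags, paste(mags, "*", names(coefs)))
  ops <- ifelse(coefs < 0, " - ", " + ")
  # The first term gets a bare leading minus if negative, otherwise no operator
  ops[1] <- if (coefs[1] < 0) "-" else ""
  paste(response, "=", paste0(ops, pieces, collapse = ""))
}

mm <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, data = iris)
eqn <- as_disp_eqn_signed(mm)
print(eqn)

# Evaluate the fitted model for new inputs with predict() instead of by hand
new_obs <- data.frame(
  Sepal.Width  = 3,
  Petal.Length = 4,
  Species      = factor("versicolor", levels = levels(iris$Species))
)
pred <- predict(mm, newdata = new_obs)
print(pred)
```

Note that the displayed coefficients are rounded, so plugging numbers into the printed string will differ slightly from what predict() returns.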