r linear-regression interpretation

Does the sorting order matter when interpreting beta estimates in a regression model?


Seems like a very basic question but I just wanted to confirm. I'm running a multivariable linear regression model adjusted for different types of covariates (some numeric, some categorical, etc.). A sample of the model is shown below:

fit <- ols(outcome ~ exposure + age + zbmi + income + sex + ethnicity) 

Both the "outcome" and "exposure" are continuous numerical variables.

My question is: say I run the model and the beta estimate, 95% CI, and p-value for the exposure look something like this:

B = -0.20 // 95%CI: [-0.50, -0.001] // p = 0.04 

Would it be appropriate to interpret this as: "For every 1-unit increase in the exposure, there is a 0.20 decrease in the outcome"?

What I want to know is how the model determined the direction of "per 1 unit increase". Is that just the default order in which R sorts continuous variables when fitting a regression model? Also, since both my outcome and exposure are continuous variables, does this mean that R automatically sorted these variables in ascending order (by default) when I ran the model?

Just a bit confused on whether this sorting order matters before I run any regression model using continuous variables. Any tips / help would be appreciated!


Solution

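  • Under OLS, there is no ordering or sorting involved, either of the observations or of the predictors. For each observation, the right-hand side of the equation (the intercept plus each beta times its predictor) is summed and subtracted from the left-hand side to give a residual. The betas are then chosen to minimize the sum of the squared residuals over all observations. Because a sum does not depend on the order of its terms, neither the order of the rows in your data nor the order of the variables in the formula changes the estimates. "Per 1 unit increase" is simply what a slope means: B = -0.20 says the expected outcome is 0.20 lower for each 1-unit higher exposure, with the other covariates held fixed, so your interpretation is appropriate.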

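    For interpretation of your betas, each coefficient is the effect of its predictor adjusted for all the others, so it doesn't matter in which order you list them. Side note: the predictors do not have to be independent of one another, only not perfectly collinear. In reality, you will usually get some correlation among the predictors, and this will be reflected in larger standard errors for the affected coefficients.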
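A minimal sketch of the points above, on simulated data. The variable names, the simulated effect sizes, and the use of base R's `lm()` in place of `rms::ols()` are assumptions made for illustration; both functions fit ordinary least squares, so the coefficients should agree. It checks that shuffling the rows and reordering the terms leave the exposure coefficient unchanged, and that the coefficient equals the difference in predicted outcome between two otherwise identical people whose exposure differs by 1 unit.

```r
set.seed(1)
n <- 200

# Hypothetical data: continuous exposure and outcome, plus two covariates
dat <- data.frame(
  exposure = rnorm(n),
  age      = runif(n, 20, 70),
  sex      = factor(sample(c("F", "M"), n, replace = TRUE))
)
dat$outcome <- 1 - 0.2 * dat$exposure + 0.03 * dat$age + rnorm(n)

# Reference fit
fit <- lm(outcome ~ exposure + age + sex, data = dat)
b   <- coef(fit)["exposure"]

# 1. Same model after putting the rows in a random order
shuffled     <- dat[sample(n), ]
fit_shuffled <- lm(outcome ~ exposure + age + sex, data = shuffled)

# 2. Same model with the terms listed in a different order
fit_reordered <- lm(outcome ~ sex + age + exposure, data = dat)

same_rows  <- isTRUE(all.equal(b, coef(fit_shuffled)["exposure"]))
same_terms <- isTRUE(all.equal(b, coef(fit_reordered)["exposure"]))

# 3. "Per 1 unit increase": compare predictions for two people who differ
#    only in exposure (0 vs 1), with age and sex held fixed
new <- data.frame(
  exposure = c(0, 1),
  age      = c(40, 40),
  sex      = factor(c("F", "F"), levels = levels(dat$sex))
)
diff_pred <- unname(diff(predict(fit, newdata = new)))

print(c(same_rows = same_rows, same_terms = same_terms))
print(c(beta = unname(b), predicted_difference = diff_pred))
```

The sign of the coefficient comes from the data themselves (higher exposure values going with lower outcome values), not from any sorting that happens before or during the fit.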