I have a data set with 5 independent variables and 1 dependent variable. Can I apply a polynomial regression model to it? If yes, please show me how to fit a polynomial regression with multiple independent variables in R when I have no prior information about the relationship between them.
Also, how do I use the predict function in this scenario?
Assume that the columns in my data are
ind1 ind2 ind3 ind4 ind5 dep
Here are some examples that will generate your polynomials.
# Simulate some data
ind1 <- rnorm(100)
ind2 <- rnorm(100)
ind3 <- rnorm(100)
ind4 <- rnorm(100)
ind5 <- rnorm(100)
dep <- rnorm(100, mean=ind1)
Polynomials can be defined manually using the I() function. For example, a polynomial of degree 3 for ind1 is
lm(dep ~ ind1 + I(ind1^2) + I(ind1^3))
You can also use the poly() function to generate the polynomials for you, e.g.,
lm(dep ~ poly(ind1, degree=3, raw=TRUE))
The argument raw=TRUE is needed to get raw rather than orthogonal polynomials. It doesn't change the predictions or the fit, but it makes the parameter estimates directly comparable to those from the manual I() formulation above.
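If you want to check this claim yourself, here is a minimal sketch that fits the same cubic three ways and compares the results. The seed and the object names (fit_manual, fit_raw, fit_orth) are only illustrative choices, not part of the original example.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
ind1 <- rnorm(100)
dep  <- rnorm(100, mean = ind1)

# The same cubic, parameterised three ways
fit_manual <- lm(dep ~ ind1 + I(ind1^2) + I(ind1^3))
fit_raw    <- lm(dep ~ poly(ind1, degree = 3, raw = TRUE))
fit_orth   <- lm(dep ~ poly(ind1, degree = 3))  # orthogonal (the default)

# Fitted values should agree between raw and orthogonal polynomials
same_fit <- all.equal(unname(fitted(fit_raw)), unname(fitted(fit_orth)))

# Raw coefficients should agree with the manual I() version
same_coef <- all.equal(unname(coef(fit_raw)), unname(coef(fit_manual)))

print(same_fit)
print(same_coef)

# The orthogonal coefficients are on a different basis, so compare by eye
print(coef(fit_raw))
print(coef(fit_orth))
```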
Thus, you can fit your desired model with
lm(dep ~ poly(ind1, degree=3, raw=TRUE) +
poly(ind2, degree=3, raw=TRUE) +
poly(ind3, degree=3, raw=TRUE) +
poly(ind4, degree=3, raw=TRUE) +
poly(ind5, degree=3, raw=TRUE))
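As for the predict part of the question, one way to do it is sketched below. It assumes the variables live in a data frame (called dat here, a name chosen for illustration) and that new observations arrive in a data frame with the same column names; the values in new_obs are made up.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Simulated data, held in a data frame so predict() can match columns by name
dat <- data.frame(ind1 = rnorm(100), ind2 = rnorm(100), ind3 = rnorm(100),
                  ind4 = rnorm(100), ind5 = rnorm(100))
dat$dep <- rnorm(100, mean = dat$ind1)

fit <- lm(dep ~ poly(ind1, degree = 3, raw = TRUE) +
                poly(ind2, degree = 3, raw = TRUE) +
                poly(ind3, degree = 3, raw = TRUE) +
                poly(ind4, degree = 3, raw = TRUE) +
                poly(ind5, degree = 3, raw = TRUE),
          data = dat)

# Hypothetical new observations; column names must match the predictors
new_obs <- data.frame(ind1 = c(0, 1), ind2 = c(0, 1), ind3 = c(0, 1),
                      ind4 = c(0, 1), ind5 = c(0, 1))

# Point predictions
pred <- predict(fit, newdata = new_obs)
print(pred)

# Point predictions together with prediction intervals
pred_int <- predict(fit, newdata = new_obs, interval = "prediction")
print(pred_int)
```

Fitting with data = dat rather than free-standing vectors is what lets predict() look up the predictors in newdata; the polynomial terms are rebuilt from the new values automatically.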
Note that it may be necessary to scale your predictors. If a predictor takes large values, then its cube (e.g., ind1^3) can become large enough to cause numerical problems in the fit.
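A minimal sketch of one way to scale, assuming a hypothetical predictor named big with values around 10,000: standardise it before fitting, and apply the same centre and scale to any new data before predicting. All names and numbers here are made up for illustration.

```r
set.seed(7)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical predictor with large values; big^3 is on the order of 1e12
big <- rnorm(100, mean = 1e4, sd = 100)
dep <- rnorm(100, mean = (big - 1e4) / 100)

# Keep the centre and scale so new data can be transformed the same way
ctr <- mean(big)
s   <- sd(big)
big_scaled <- (big - ctr) / s  # same result as as.numeric(scale(big))

fit_scaled <- lm(dep ~ poly(big_scaled, degree = 3, raw = TRUE))

# New observations on the original scale, transformed before predicting
new_big <- c(9900, 10100)
new_dat <- data.frame(big_scaled = (new_big - ctr) / s)
pred_scaled <- predict(fit_scaled, newdata = new_dat)
print(pred_scaled)
```

Dropping raw=TRUE is another option when the coefficients themselves are not of interest, since orthogonal polynomials are generally better conditioned numerically.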