How can I count number of interactions poly
will return?
If I have two variables, then the number of interactions poly
will return in function of degree is given by:
degree <- 2
dim(poly(rnorm(10), rnorm(10), degree = degree))[2]
That is the same as:
(degree^2+3*degree)/2
Is there anyway to count the number of interactions depending on the number of degree and variables (in case I use more than two)?
Math result from combinations
Suppose you have p
variables, the number of interactions associated with degree d
is computed by:
fd <- function (p, d) {
k <- choose(p, d)
if (d > 1) k <- k + p * sum(choose(p-1, 0:(d-2)))
return(k)
}
The function poly
(actually polym
in this case), with p
input variables and a degree = D
, will construct interactions from degree = 1
up to degree = D
. So the following function counts it:
fD <- function (p, D) {
if (D < 1) return(0)
component <- sapply(1:D, fd, p = p)
list(component = component, ncol = sum(component))
}
The entry component
gives the number of interaction for each degree from 1
to D
, and ncol
component gives total number of interactions.
A quick test:
a <- runif(50)
b <- runif(50)
c <- runif(50)
d <- runif(50)
X <- poly(a, b, c, d, degree = 3)
ncol(X)
# 34
fD(4, 3)
# component
# [1] 4 10 20
#
# ncol
# [1] 34
How R does this?
The first few lines of the source code for polym
explains how R addresses this problem. An expand.grid
is first called to get all possible interactions, then a rowSums
is called to compute the degree of all available interactions. Finally, a filter is applied to retain only interactions terms with degree between 1
and D
.