Is there an easy way to determine if one vector is nested within another? In other words, in the example below, each value of bar is associated with one and only one value of foo, so bar is nested within foo.
data.frame(foo=rep(seq(4), each=4), bar=rep(seq(8), each=2))
To clarify, here is the desired result:
foo <- rep(seq(4), each=4)
bar <- rep(seq(8), each=2)
qux <- rep(seq(8), times=2)
# using a fake operator for illustration:
bar %is_nested_in% foo # should return TRUE
qux %is_nested_in% foo # should return FALSE
Suppose you have two factors f and g, and want to know whether g is nested in f.
Method 1: For people who love linear algebra
Consider the design matrices for the two factors:
Xf <- model.matrix(~ f + 0)
Xg <- model.matrix(~ g + 0)
If g is nested in f, then the column space of Xf must be a subspace of the column space of Xg. In other words, for any linear combination of Xf's columns, y = Xf %*% bf, the equation Xg %*% bg = y can be solved exactly.
y <- Xf %*% rnorm(ncol(Xf)) ## some random linear combination on `Xf`'s columns
c(crossprod(round(.lm.fit(Xg, y)$residuals, 8))) ## least squares residuals
## if this is 0, you have nesting.
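As a rough sketch, the residual check can be run on the vectors from the question. The helper name nesting_rss is hypothetical (it is not part of the original answer), and the vectors are converted to factors first so that model.matrix builds indicator columns.

```r
# Example vectors from the question, converted to factors
foo <- factor(rep(seq(4), each = 4))
bar <- factor(rep(seq(8), each = 2))
qux <- factor(rep(seq(8), times = 2))

# Hypothetical helper: residual sum of squares from regressing a random
# linear combination of Xf's columns on Xg (rounded as in the code above)
nesting_rss <- function(g, f) {
  Xf <- model.matrix(~ f + 0)
  Xg <- model.matrix(~ g + 0)
  y <- Xf %*% rnorm(ncol(Xf))
  c(crossprod(round(.lm.fit(Xg, y)$residuals, 8)))
}

set.seed(1)                  # only to make the random combination reproducible
nesting_rss(bar, foo) == 0   # TRUE: bar is nested in foo
nesting_rss(qux, foo) == 0   # FALSE: qux is not nested in foo
```

Because y is random, a zero residual for a non-nested pair would require an unlucky draw, which happens with probability zero in theory; the rounding to 8 decimals only absorbs floating-point noise.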
Method 2: For people who love statistics
Check the contingency table of the two factors:
M <- table(f, g)
If every column has exactly one non-zero entry, then g is nested in f. In other words:
all(colSums(M > 0L) == 1L)
## `TRUE` if you have nesting
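To see what the table check does on the question's data, here is a small illustration. The names M_bar and M_qux are made up for this example; foo plays the role of f, and bar or qux the role of g.

```r
# Example vectors from the question
foo <- rep(seq(4), each = 4)
bar <- rep(seq(8), each = 2)
qux <- rep(seq(8), times = 2)

# Rows are levels of foo, columns are levels of bar / qux
M_bar <- table(foo, bar)
M_qux <- table(foo, qux)

colSums(M_bar > 0L)  # each bar value occurs under exactly 1 foo value
colSums(M_qux > 0L)  # each qux value occurs under 2 foo values

all(colSums(M_bar > 0L) == 1L)  # TRUE:  bar is nested in foo
all(colSums(M_qux > 0L) == 1L)  # FALSE: qux is not nested in foo
```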
Comment: Either method can easily be squeezed into a single line of code.
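For instance, one possible one-liner wraps Method 2 in the operator the question used for illustration. This is only a sketch: the operator is not a built-in, and it assumes g has no unused factor levels (an unused level gives an all-zero column, so the check returns FALSE; calling droplevels() first avoids that).

```r
# Hypothetical operator named after the one in the question, built on Method 2
`%is_nested_in%` <- function(g, f) all(colSums(table(f, g) > 0L) == 1L)

foo <- rep(seq(4), each = 4)
bar <- rep(seq(8), each = 2)
qux <- rep(seq(8), times = 2)

bar %is_nested_in% foo  # TRUE
qux %is_nested_in% foo  # FALSE
```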