Take the following example:
fit <- lm(Sepal.Length ~ log(Sepal.Width), data = iris)
I would like a copy of iris that only includes the variables that were involved in making fit. I think model.matrix() and model.frame() don't quite do it because of the log; they will include log(Sepal.Width) but not Sepal.Width. Basically, I want a minimal version of iris that contains only the variables used to build fit.
How can I do that? This is of course just an example, and I would like a more general solution (say you had a number of variables used in making a fit, many of them passed through transformations that are not necessarily invertible).
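To make the problem concrete, here is a small sketch of what model.frame() and model.matrix() hold for this fit. It only inspects column names and assumes nothing beyond the fit defined above (the fit is re-created so the snippet runs on its own):

```r
# Re-create the fit from the example above
fit <- lm(Sepal.Length ~ log(Sepal.Width), data = iris)

# model.frame() stores the transformed term, not the raw column
names(model.frame(fit))
#[1] "Sepal.Length"     "log(Sepal.Width)"

# model.matrix() stores the design matrix: intercept plus the transformed term
colnames(model.matrix(fit))
#[1] "(Intercept)"      "log(Sepal.Width)"
```

In both cases the raw Sepal.Width column is absent, which is the gap described above.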
I think what you want is get_all_vars():
get_all_vars(fit, data = iris)
Output:
#  Sepal.Length Sepal.Width
#1          5.1         3.5
#2          4.9         3.0
#3          4.7         3.2
#4          4.6         3.1
#5          5.0         3.6
#6          5.4         3.9
#7          4.6         3.4
# ...
This returns the untransformed variables (i.e., Sepal.Width instead of log(Sepal.Width)), as seen here:
all.equal(iris$Sepal.Width,
get_all_vars(fit, data = iris)$Sepal.Width)
#[1] TRUE
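For the more general case asked about, the same idea can be sketched with several transformed terms. The model fit2 and the object minimal below are hypothetical names introduced only for illustration. As an alternative to get_all_vars(), the sketch uses all.vars() on the model formula to list the raw variable names and subsets the data by hand; this assumes every name in the formula is a column of the data frame, which may not hold if the formula also refers to objects in the calling environment.

```r
# Hypothetical model with several transformed terms
fit2 <- lm(Sepal.Length ~ log(Sepal.Width) + I(Petal.Length^2) + Species,
           data = iris)

# Raw variable names appearing anywhere in the formula
vars <- all.vars(formula(fit2))
vars
#[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Species"

# Manual subset of the original data
# (assumes every name in vars is a column of iris)
minimal <- iris[, vars, drop = FALSE]

# get_all_vars() yields the same set of columns
identical(names(minimal), names(get_all_vars(fit2, data = iris)))
#[1] TRUE
```

Neither approach needs the transformations to be invertible, since both read the original columns rather than undoing log() or I().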