r, tidymodels, r-recipes, pls

Get the proportion of the variance explained in a plsda with recipes


I am trying to compute the proportion of the variance explained by each component in a PLS-DA, using the tidymodels framework.

Here's the "gold standard" result with the mixOmics package:

library(mixOmics)

mix_plsda <- plsda(X = iris[-5], Y = iris$Species, ncomp = 4)

mix_var_expl <- mix_plsda$prop_expl_var$X
mix_var_expl
#>       comp1       comp2       comp3       comp4 
#> 0.729028323 0.227891235 0.037817718 0.005262724
sum(mix_var_expl) # check
#> [1] 1

And here with recipes::step_pls():

library(recipes)
tidy_plsda <-
  recipe(Species ~ ., data = iris) %>% 
  step_pls(all_numeric_predictors(), outcome = "Species", num_comp = 4) %>% 
  prep()

tidy_sd <- tidy_plsda$steps[[1]]$res$sd
tidy_sd
#> [1] 0.8280661 0.4358663 1.7652982 0.7622377
tidy_sd^2 / sum(tidy_sd^2)
#> [1] 0.14994532 0.04154411 0.68145793 0.12705264

The element that most resembles an explained variance is sd, but as you can see, there is no obvious relationship between these two vectors.

How can I get mix_var_expl from tidy_plsda? Thanks!

Created on 2022-09-20 by the reprex package (v2.0.1)


Solution

  • The recipe object does not save the mixOmics model, only the parts that we need to process new data. The sd element holds the standard deviations of the predictors, not of the PLS components, which is why it bears no relationship to mix_var_expl. There is currently no way to get what you want from the object.

    I've added a GitHub issue to add more objects to the results though: https://github.com/tidymodels/recipes/issues/1038
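
    As a quick sanity check on the claim about sd, here is a minimal base R sketch that compares the column standard deviations of the iris predictors with the values printed in the question. The names predictor_sd and reported_sd are only illustrative, and reported_sd is copied by hand from the question's output rather than read from the prepped recipe.

```r
# Per-column standard deviations of the four numeric iris predictors
# (base R only; no recipes or mixOmics needed for this check).
predictor_sd <- sapply(iris[-5], sd)
predictor_sd

# Values printed for tidy_plsda$steps[[1]]$res$sd in the question,
# copied by hand (so they are rounded to 7 decimal places).
reported_sd <- c(0.8280661, 0.4358663, 1.7652982, 0.7622377)

# Compare the two, ignoring names and allowing for the rounding above.
all.equal(unname(predictor_sd), reported_sd, tolerance = 1e-6)
#> [1] TRUE
```

    The two vectors agree, so tidy_sd^2 / sum(tidy_sd^2) only gives each predictor's share of the total raw variance, not the variance explained by each component.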