r pca decision-tree

Objects of type prcomp not supported by autoplot


I have a data frame with multiple columns. I want to run PCA and then classify with decision trees to check whether the columns il.count.ratio through une.count.ratio can actually help differentiate the two professions.

Profession    il.count.ratio   elle.count.ratio    un.count.ratio   une.count.ratio
secretary      1.7241              1.3514             1.473            0.7380
secretary      0.0000              2.8736             2.8536           0.8370
driver         1.7065              0.0000             0.0000           0.7380
driver         0.5284              1.4733             0.7380           0.5280

Running the code below gives this error: Error in autoplot(): ! Objects of type prcomp not supported by autoplot.

Does anyone know what the problem might be?

data %>%
  select(il.count.ratio:une.count.ratio) %>%
  prcomp() %>%
  autoplot(data = data,
           colour = 'Profession',
           loadings = TRUE,
           loadings.colour = "blue",
           loadings.label = TRUE,
           loadings.label.colour = "blue")

Solution

  • You could use broom's augment (for the PCs) and tidy (for the loadings) to get a data frame before plotting with ggplot:

    library(tidyverse)
    library(broom)
    
    data <- tribble(
      ~Profession, ~il.count.ratio, ~elle.count.ratio, ~un.count.ratio, ~une.count.ratio,
      "secretary", 1.7241, 1.3514, 1.473, 0.7380,
      "secretary", 0.0000, 2.8736, 2.8536, 0.8370,
      "driver", 1.7065, 0.0000, 0.0000, 0.7380,
      "driver", 0.5284, 1.4733, 0.7380, 0.5280
    )
    
    data %>%
      select(il.count.ratio:une.count.ratio) %>%
      prcomp() %>% 
      augment(data) %>% 
      ggplot(aes(.fittedPC1, .fittedPC2, colour = Profession)) +
      geom_point()
    
    # For the loadings, replace the augment(data) step with:
    # tidy(matrix = "loadings")
    

    Created on 2022-06-30 by the reprex package (v2.0.1)
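
The loadings variant mentioned in the comment can be sketched roughly as follows. This is only an illustration, not part of the original answer: it rebuilds the same example data, and the object names `pca`, `loadings` and `p` are made up here. It relies on `tidy()` returning the loadings in long form (columns `column`, `PC`, `value`), which are then pivoted so that each principal component gets its own column.

```r
library(tidyverse)
library(broom)

# Same example data as in the answer above
data <- tribble(
  ~Profession, ~il.count.ratio, ~elle.count.ratio, ~un.count.ratio, ~une.count.ratio,
  "secretary", 1.7241, 1.3514, 1.473, 0.7380,
  "secretary", 0.0000, 2.8736, 2.8536, 0.8370,
  "driver", 1.7065, 0.0000, 0.0000, 0.7380,
  "driver", 0.5284, 1.4733, 0.7380, 0.5280
)

# Fit the PCA once so it can be reused (hypothetical name `pca`)
pca <- data %>%
  select(il.count.ratio:une.count.ratio) %>%
  prcomp()

# Long format (column, PC, value) -> one row per variable, one column per PC
loadings <- pca %>%
  tidy(matrix = "loadings") %>%
  pivot_wider(names_from = PC, names_prefix = "PC", values_from = value)

# Draw each variable's loading on PC1/PC2 as an arrow from the origin
p <- ggplot(loadings) +
  geom_segment(aes(x = 0, y = 0, xend = PC1, yend = PC2),
               arrow = arrow(length = unit(0.2, "cm")), colour = "blue") +
  geom_text(aes(x = PC1, y = PC2, label = column), colour = "blue", vjust = -0.5)
p
```

The arrows are on the scale of the rotation matrix, so overlaying them on the score plot from `augment()` may need a manual scaling factor.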
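
As a possible alternative, the error message suggests that no `autoplot()` method for `prcomp` objects was registered when the code ran. The ggfortify package provides such a method, so loading it may be enough to make the original pipeline work unchanged. The sketch below assumes ggfortify is installed and that this was the missing piece; the question does not say which packages were loaded. The name `p` is introduced only for illustration.

```r
library(tidyverse)
library(ggfortify)  # assumption: installed; supplies autoplot() for prcomp objects

# Same example data as in the answer above
data <- tribble(
  ~Profession, ~il.count.ratio, ~elle.count.ratio, ~un.count.ratio, ~une.count.ratio,
  "secretary", 1.7241, 1.3514, 1.473, 0.7380,
  "secretary", 0.0000, 2.8736, 2.8536, 0.8370,
  "driver", 1.7065, 0.0000, 0.0000, 0.7380,
  "driver", 0.5284, 1.4733, 0.7380, 0.5280
)

# The pipeline from the question, unchanged apart from storing the plot
p <- data %>%
  select(il.count.ratio:une.count.ratio) %>%
  prcomp() %>%
  autoplot(data = data,
           colour = "Profession",
           loadings = TRUE,
           loadings.colour = "blue",
           loadings.label = TRUE,
           loadings.label.colour = "blue")
p
```

Compared with the broom approach, this keeps the one-call biplot but gives less control over the individual layers.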