I have a data frame with multiple columns. I want to run PCA and a decision-tree classification to check whether the columns il.count.ratio through une.count.ratio can actually help to differentiate the two professions.
Profession  il.count.ratio  elle.count.ratio  un.count.ratio  une.count.ratio
secretary   1.7241          1.3514            1.4730          0.7380
secretary   0.0000          2.8736            2.8536          0.8370
driver      1.7065          0.0000            0.0000          0.7380
driver      0.5284          1.4733            0.7380          0.5280
The code below fails with this error:
Error in autoplot():
! Objects of type prcomp not supported by autoplot.
Does anyone know what the problem might be?
data %>%
  select(il.count.ratio:une.count.ratio) %>%
  prcomp() %>%
  autoplot(data = data,
           colour = 'Profession',
           loadings = TRUE,
           loadings.colour = "blue",
           loadings.label = TRUE,
           loadings.label.colour = "blue")
You could use broom's augment (for the PCs) and tidy (for the loadings) to get a data frame before ggplotting:
library(tidyverse)
library(broom)
data <- tribble(
  ~Profession, ~il.count.ratio, ~elle.count.ratio, ~un.count.ratio, ~une.count.ratio,
  "secretary", 1.7241, 1.3514, 1.473, 0.7380,
  "secretary", 0.0000, 2.8736, 2.8536, 0.8370,
  "driver", 1.7065, 0.0000, 0.0000, 0.7380,
  "driver", 0.5284, 1.4733, 0.7380, 0.5280
)
data %>%
  select(il.count.ratio:une.count.ratio) %>%
  prcomp() %>%
  augment(data) %>%
  ggplot(aes(.fittedPC1, .fittedPC2, colour = Profession)) +
  geom_point()
# Use instead of augment for the loadings
# tidy(matrix = "loadings")
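To make the loadings variant concrete, here is a minimal sketch of how the tidy(matrix = "loadings") output could be reshaped and drawn as arrows. The object names pca, loadings and p, and the choice of geom_segment plus geom_text, are my own assumptions for illustration and not the only way to do it:

```r
library(tidyverse)
library(broom)

data <- tribble(
  ~Profession, ~il.count.ratio, ~elle.count.ratio, ~un.count.ratio, ~une.count.ratio,
  "secretary", 1.7241, 1.3514, 1.473, 0.7380,
  "secretary", 0.0000, 2.8736, 2.8536, 0.8370,
  "driver", 1.7065, 0.0000, 0.0000, 0.7380,
  "driver", 0.5284, 1.4733, 0.7380, 0.5280
)

# Fit the PCA once so it can be reused
pca <- data %>%
  select(il.count.ratio:une.count.ratio) %>%
  prcomp()

# tidy() returns one row per (variable, PC) pair with columns column, PC, value;
# widen it so every PC gets its own column (PC1, PC2, ...)
loadings <- pca %>%
  tidy(matrix = "loadings") %>%
  pivot_wider(names_from = PC, values_from = value, names_prefix = "PC")

# Draw each variable as an arrow from the origin (illustrative styling only)
p <- ggplot(loadings, aes(PC1, PC2, label = column)) +
  geom_segment(aes(x = 0, y = 0, xend = PC1, yend = PC2),
               colour = "blue",
               arrow = arrow(length = unit(0.1, "inches"))) +
  geom_text(colour = "blue", vjust = -0.5)

p
```

The scores from augment() and the loadings are on different scales, so if you want both in one plot you would probably need to rescale the loadings first.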
Created on 2022-06-30 by the reprex package (v2.0.1)
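As for the error itself: as far as I know, ggplot2 only provides the autoplot() generic, and the method for prcomp objects comes from the ggfortify package. The message suggests that package was not loaded. Assuming ggfortify is installed, a sketch of the original call with it attached could look like this (the name p is just for illustration):

```r
library(tidyverse)
library(ggfortify)  # assumption: installed; it registers autoplot() for prcomp

data <- tribble(
  ~Profession, ~il.count.ratio, ~elle.count.ratio, ~un.count.ratio, ~une.count.ratio,
  "secretary", 1.7241, 1.3514, 1.473, 0.7380,
  "secretary", 0.0000, 2.8736, 2.8536, 0.8370,
  "driver", 1.7065, 0.0000, 0.0000, 0.7380,
  "driver", 0.5284, 1.4733, 0.7380, 0.5280
)

# Same pipeline as in the question, unchanged apart from loading ggfortify
p <- data %>%
  select(il.count.ratio:une.count.ratio) %>%
  prcomp() %>%
  autoplot(data = data,
           colour = "Profession",
           loadings = TRUE,
           loadings.colour = "blue",
           loadings.label = TRUE,
           loadings.label.colour = "blue")

p
```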