How to plot variance of Principal components in dataframe format?


I am trying to produce a scree plot like the one shown below, but starting from a data frame object. My principal component scores are stored in a data frame (DF, shown further down). How can I plot the variance explained by each component from it?

library("factoextra")
data(iris)
res.pca <- prcomp(iris[, -5],  scale = TRUE)
# Extract the  eigenvalues/variances
get_eig(res.pca)
fviz_eig(res.pca, geom="line")

[Image: scree plot produced by fviz_eig(res.pca, geom = "line")]

DF <- as.data.frame(res.pca$x)

Solution

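  • The variance of each column of DF is the eigenvalue of that component, so the proportion of explained variance is each column's variance divided by the total variance (4 here, because the four variables were scaled to unit variance). You can calculate and plot it like this: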

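    library(tidyverse)
    
    data(iris)
    res.pca <- prcomp(iris[, -5], scale. = TRUE)
    
    # Scores of the observations on each principal component
    DF <- as.data.frame(res.pca$x)
    
    # Variance of each score column = eigenvalue of that component
    eig <- unname(sapply(DF, var))
    
    # Divide by the total variance instead of hardcoding the number of variables
    new_DF <- tibble(`Percentage of Variance Explained` = eig / sum(eig),
                     Dimensions = seq_along(eig))
    
    ggplot(new_DF, aes(x = Dimensions, y = `Percentage of Variance Explained`)) +
      geom_line() +
      geom_point() +
      # Values are proportions (0 to 1); show them as percentages on the axis
      scale_y_continuous(labels = scales::percent_format())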
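
As a sanity check, the variances computed from the score columns can be compared with the eigenvalues that prcomp() stores itself (as standard deviations in res.pca$sdev). The sketch below uses base R only. The names eig, prop and eig_table are illustrative, and the column names are merely chosen to resemble the layout of get_eig(), not taken from it:

```r
# Base R only; no extra packages are assumed here
data(iris)
res.pca <- prcomp(iris[, -5], scale. = TRUE)
DF <- as.data.frame(res.pca$x)

# Variance of each score column = eigenvalue of that component
eig <- sapply(DF, var)

# Proportion of variance explained by each component
prop <- eig / sum(eig)

# Cross-check against the standard deviations stored by prcomp()
check <- all.equal(unname(eig), res.pca$sdev^2)
print(check)

# Summary table (illustrative column names)
eig_table <- data.frame(
  eigenvalue = eig,
  variance.percent = 100 * prop,
  cumulative.variance.percent = 100 * cumsum(prop)
)
print(eig_table)
```

If the check passes, plotting from DF and plotting from res.pca directly should give the same curve.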
    
