r pca factoextra factominer

Troubleshooting factoextra (R)


My dataset has 25 columns: columns 1-24 are numeric and contain missing values, and column 25 is a categorical variable with two levels ('low' or 'high').

I am trying to produce a PCA plot like the one on sthda.com, where my variables are labeled and my two groups (low or high) are color-coded.

First, I had to impute the missing values for my cols 1-24: res.comp <- missMDA::imputePCA(dat[,-25],ncp=5)

Then I did PCA: res.pca <- FactoMineR::PCA(res.comp$completeObs)

I then try to plot the individuals, bringing column 25 back in for grouping, and get an error: factoextra::fviz_pca_ind(res.pca, habillage = dat[,25], addEllipses = TRUE, ellipse.level = 0.68) + scale_color_brewer(palette = "Dark2")

```
Error in apply(ind[, colnames(grp)], 2, as.character) :
  dim(X) must have a positive length
```

Does anyone have tips for how to get my graph working? TIA!


Solution

  • It's better if you can include a minimal reproducible example (including your data) that produces the error.

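    This should mimic what you're doing, but it keeps the categorical column in the data and uses quali.sup to give both imputePCA and PCA the index of that supplementary categorical variable. habillage can then refer to the same column by its index.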

    library(factoextra)
    library(FactoMineR)
    library(missMDA)
    
    set.seed(1)
    df <- data.frame(
      a = rnorm(100, 0, 1),
      b = rnorm(100, 0, 1),
      c = rnorm(100, 0, 1),
      d = rep(c("low", "high"), 50)
    )
    
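    # Introduce a missing value so there is something to impute
    df$a[1] <- NA
    
    # Impute the numeric columns; column 4 is categorical and is left out of the imputation
    res.comp <- imputePCA(df, quali.sup = 4)
    
    # Run the PCA, declaring column 4 as a supplementary qualitative variable
    pca <- PCA(res.comp$completeObs, quali.sup = 4, graph = FALSE)
    
    # Color individuals by the supplementary variable and draw 68% ellipses
    fviz_pca_ind(pca, habillage = 4, addEllipses = TRUE, ellipse.level = 0.68) +
      scale_color_brewer(palette = "Dark2")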
    

    Created on 2024-04-10 with reprex v2.1.0
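
    An alternative that stays closer to the pipeline in the question is sketched below, in case it helps. It imputes and runs the PCA on the numeric columns only, then passes the grouping column to habillage as a factor, which the factoextra documentation describes as a supported input. The toy data, the column name `group` and the object names `grp`, `p` and `p_biplot` are assumptions made for illustration. Extracting the column with `[[ ]]` always returns a plain vector, whereas `dat[, 25]` stays a one-column table if `dat` is a tibble, which may be what triggers the error above. The last call uses fviz_pca_biplot to show labeled variables together with the colored groups, as asked in the question.

```r
library(factoextra)
library(FactoMineR)
library(missMDA)

set.seed(1)
# Toy stand-in for the real data: 3 numeric columns plus a two-level group
dat <- data.frame(
  a = rnorm(100),
  b = rnorm(100),
  c = rnorm(100),
  group = rep(c("low", "high"), 50)
)
dat$a[1] <- NA

# Impute and run the PCA on the numeric columns only
res.comp <- imputePCA(dat[, 1:3], ncp = 2)
res.pca <- PCA(res.comp$completeObs, graph = FALSE)

# Pull the grouping column out as a plain vector and make it a factor
grp <- as.factor(dat[["group"]])

# Individuals colored by group, with 68% ellipses
p <- fviz_pca_ind(res.pca, habillage = grp,
                  addEllipses = TRUE, ellipse.level = 0.68) +
  scale_color_brewer(palette = "Dark2")

# Biplot: variables labeled, individuals shown as colored points only
p_biplot <- fviz_pca_biplot(res.pca, habillage = grp, label = "var",
                            addEllipses = TRUE, ellipse.level = 0.68) +
  scale_color_brewer(palette = "Dark2")

print(p)
print(p_biplot)
```

    With the real data, the equivalent would be `as.factor(dat[[25]])` for the grouping and `dat[, -25]` for the numeric part, as in the question.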