
Error in PCA(Table, quanti.sup =, graph = FALSE) : The following variables are not quantitative


I'm trying to run a PCA with the following command:

res.table=PCA(table,quanti.sup=1:5,graph=FALSE)

But I get the following error:

Error in PCA(EAUX, quanti.sup = 1:5, graph = FALSE) : 
The following variables are not quantitative:  NOM
The following variables are not quantitative:  ACRO
The following variables are not quantitative:  PAYS
The following variables are not quantitative:  TYPE
The following variables are not quantitative:  PG

Here's the table:

Output of dput(head(table)):

structure(list(NOM = c("Evian", "Montagne des Pyrenees", "Cristaline-St-Cyr", 
"Fiee des Lois", "Volcania", "Saint Diery"), ACRO = c("EVIAN", 
"MTPYR", "CRIST", "FIEE", "VOLCA", "STDIE"), PAYS = c("F", "F", 
"F", "F", "F", "F"), TYPE = c("M", "S", "S", "S", "S", "M"), 
    PG = c("P", "P", "P", "P", "P", "G"), CA = c(78, 48, 71, 
    89, 4.1, 85), MG = c(24, 11, 5.5, 31, 1.7, 80), `NA` = c(5, 
    34, 11.2, 17, 2.7, 385), K = c(1, 1, 3.2, 2, 0.9, 65), SUL = c(10, 
    16, 5, 47, 1.1, 25), NO3 = c(3.8, 4, 1, 0, 0.8, 1.9), HCO3 = c(357, 
    183, 250, 360, 25.8, 1350), CL = c(4.5, 50, 20, 28, 0.9, 
    285), MOY = c(60.41, 43.38, 45.86, 71.75, 4.75, 284.61)), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

Thank you


Solution

  • The code below uses the sample data you posted:

    library(fastDummies)
    table <- data.frame(NOM = c("Evian", "Montagne des Pyrenees", "Cristaline-St-Cyr", 
                                    "Fiee des Lois", "Volcania", "Saint Diery"), 
                            ACRO = c("EVIAN", "MTPYR", "CRIST", "FIEE", "VOLCA", "STDIE"), 
                            PAYS = c("F", "F","F", "F", "F", "F"), 
                            TYPE = c("M", "S", "S", "S", "S", "M"), 
                            PG = c("P", "P", "P", "P", "P", "G"), 
                            CA = c(78, 48, 71, 89, 4.1, 85), 
                            MG = c(24, 11, 5.5, 31, 1.7, 80), 
                            `NA` = c(5,34, 11.2, 17, 2.7, 385), 
                            K = c(1, 1, 3.2, 2, 0.9, 65), 
                            SUL = c(10,16, 5, 47, 1.1, 25), 
                            NO3 = c(3.8, 4, 1, 0, 0.8, 1.9), 
                            HCO3 = c(357,183, 250, 360, 25.8, 1350), 
                            CL = c(4.5, 50, 20, 28, 0.9,285), 
                            MOY = c(60.41, 43.38, 45.86, 71.75, 4.75, 284.61),
                        stringsAsFactors = TRUE)
    
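    # One-hot encode the five categorical columns and drop the originals
    table_conv <- dummy_cols(table,
                             select_columns = c("NOM","ACRO","PAYS","TYPE","PG"),
                             remove_selected_columns = TRUE)
    
    # PCA on the all-numeric table (prcomp centers but does not scale by default)
    pca <- prcomp(as.matrix(table_conv))
    pca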
    

    First, an easy way to convert categorical variables to numbers is one-hot encoding (dummy variables). The fastDummies package does this with dummy_cols().

    After converting, you can run the PCA with R's prcomp function. The code passes the data through as.matrix() first so that prcomp receives a purely numeric matrix.

    Standard deviations (1, .., p=6):
    [1] 5.168512e+02 5.385374e+01 1.796098e+01 1.134412e+01 3.412325e+00 3.841269e-14
    
    Rotation (n x k) = (26 x 6):
                                        PC1           PC2          PC3          PC4
    CA                         3.406007e-02  0.4583258631 -0.449804772  0.583419015
    MG                         5.452648e-02  0.0567007399 -0.120337337 -0.431460022
    NA.                        2.857387e-01 -0.6581371706 -0.015607808 -0.028124352
    K                          4.881587e-02 -0.1004287315  0.078457844 -0.025483654
    SUL                        1.165818e-02  0.1358452663 -0.676545259 -0.582029743
    NO3                        6.980208e-05 -0.0009273267  0.013738946  0.056474181
    HCO3                       9.122393e-01  0.3046381925  0.184652450 -0.053887132
    CL                         2.046765e-01 -0.4804253636 -0.498672171  0.351834600
    MOY                        1.939708e-01 -0.0355753827 -0.185620957 -0.016150352
    NOM_Cristaline-St-Cyr     -1.441857e-04  0.0010557813  0.003263708  0.026324720
    NOM_Evian                 -7.179722e-05  0.0044191573  0.013208312 -0.001548583
    NOM_Fiee des Lois         -6.100869e-05  0.0038456894 -0.014279448 -0.018364975
    NOM_Montagne des Pyrenees -1.811771e-04 -0.0029622537 -0.012327470  0.013222722
    NOM_Saint Diery            7.681733e-04 -0.0016729014  0.001350556 -0.001045186
    NOM_Volcania              -3.100046e-04 -0.0046854729  0.008784342 -0.018588698
    ACRO_CRIST                -1.441857e-04  0.0010557813  0.003263708  0.026324720
    ACRO_EVIAN                -7.179722e-05  0.0044191573  0.013208312 -0.001548583
    ACRO_FIEE                 -6.100869e-05  0.0038456894 -0.014279448 -0.018364975
    ACRO_MTPYR                -1.811771e-04 -0.0029622537 -0.012327470  0.013222722
    ACRO_STDIE                 7.681733e-04 -0.0016729014  0.001350556 -0.001045186
    ACRO_VOLCA                -3.100046e-04 -0.0046854729  0.008784342 -0.018588698
    PAYS_F                     0.000000e+00  0.0000000000  0.000000000  0.000000000
    TYPE_M                     6.963761e-04  0.0027462560  0.014558868 -0.002593769
    TYPE_S                    -6.963761e-04 -0.0027462560 -0.014558868  0.002593769
    PG_G                       7.681733e-04 -0.0016729014  0.001350556 -0.001045186
    PG_P                      -7.681733e-04  0.0016729014 -0.001350556  0.001045186
                                       PC5          PC6
    CA                        -0.117348632  0.135395809
    MG                         0.564075505  0.597488041
    NA.                       -0.240227688 -0.095918767
    K                         -0.452908211  0.420830725
    SUL                       -0.283494587 -0.290154729
    NO3                        0.440026805 -0.480971856
    HCO3                       0.001376105 -0.077751617
    CL                         0.297037810  0.102927429
    MOY                        0.026458834  0.119000634
    NOM_Cristaline-St-Cyr     -0.074866123  0.150502225
    NOM_Evian                  0.066867097 -0.097960878
    NOM_Fiee des Lois         -0.034173834  0.021559689
    NOM_Montagne des Pyrenees  0.072334607 -0.062871374
    NOM_Saint Diery           -0.004010064  0.001954674
    NOM_Volcania              -0.026151683 -0.009820343
    ACRO_CRIST                -0.074866123  0.150502225
    ACRO_EVIAN                 0.066867097 -0.097960878
    ACRO_FIEE                 -0.034173834  0.021559689
    ACRO_MTPYR                 0.072334607 -0.062871374
    ACRO_STDIE                -0.004010064  0.001954674
    ACRO_VOLCA                -0.026151683 -0.009820343
    PAYS_F                     0.000000000  0.000000000
    TYPE_M                     0.062857033 -0.094621030
    TYPE_S                    -0.062857033  0.097787141
    PG_G                      -0.004010064  0.001954674
    PG_P                       0.004010064 -0.001822753
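
One caveat about the output above: prcomp centers the data but does not scale it by default, so HCO3, which has by far the largest values, dominates PC1 (loading of about 0.91) and the dummy columns barely contribute. Below is a minimal sketch of a scaled PCA on the numeric columns only, assuming you want the minerals compared on an equal footing. The column name SODIUM is a hypothetical stand-in for NA, and pca_scaled and var_share are illustrative names.

```r
# Numeric measurements only (the six sample rows from the question).
# SODIUM is a hypothetical replacement for the column originally called NA.
eaux_num <- data.frame(
  CA     = c(78, 48, 71, 89, 4.1, 85),
  MG     = c(24, 11, 5.5, 31, 1.7, 80),
  SODIUM = c(5, 34, 11.2, 17, 2.7, 385),
  K      = c(1, 1, 3.2, 2, 0.9, 65),
  SUL    = c(10, 16, 5, 47, 1.1, 25),
  NO3    = c(3.8, 4, 1, 0, 0.8, 1.9),
  HCO3   = c(357, 183, 250, 360, 25.8, 1350),
  CL     = c(4.5, 50, 20, 28, 0.9, 285),
  MOY    = c(60.41, 43.38, 45.86, 71.75, 4.75, 284.61)
)

# Center each column and rescale it to unit variance before the PCA
pca_scaled <- prcomp(eaux_num, center = TRUE, scale. = TRUE)

# Share of total variance carried by each component
var_share <- pca_scaled$sdev^2 / sum(pca_scaled$sdev^2)
round(var_share, 3)
```

Note that scale. = TRUE would stop with an error on the dummy-coded table, because the constant PAYS_F column cannot be rescaled to unit variance.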
    

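    The code and output above show the dummy conversion and the PCA; adapt them to your own data. It is also advisable not to name a column NA, because NA is a reserved word in R. data.frame() has already renamed that column to NA. in the output above.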
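
If you would rather stay with FactoMineR's PCA() from the question, the error message itself points at the cause: quanti.sup is meant for supplementary quantitative variables, and columns 1 to 5 are text. PCA() also has a quali.sup argument for supplementary categorical variables, which are kept out of the construction of the axes and projected afterwards. The following is a sketch under that assumption and has not been checked against a specific FactoMineR version. The names eaux and SODIUM are hypothetical (SODIUM replaces the NA column name).

```r
library(FactoMineR)

# The six sample rows from the question; SODIUM replaces the reserved name NA
eaux <- data.frame(
  NOM    = c("Evian", "Montagne des Pyrenees", "Cristaline-St-Cyr",
             "Fiee des Lois", "Volcania", "Saint Diery"),
  ACRO   = c("EVIAN", "MTPYR", "CRIST", "FIEE", "VOLCA", "STDIE"),
  PAYS   = c("F", "F", "F", "F", "F", "F"),
  TYPE   = c("M", "S", "S", "S", "S", "M"),
  PG     = c("P", "P", "P", "P", "P", "G"),
  CA     = c(78, 48, 71, 89, 4.1, 85),
  MG     = c(24, 11, 5.5, 31, 1.7, 80),
  SODIUM = c(5, 34, 11.2, 17, 2.7, 385),
  K      = c(1, 1, 3.2, 2, 0.9, 65),
  SUL    = c(10, 16, 5, 47, 1.1, 25),
  NO3    = c(3.8, 4, 1, 0, 0.8, 1.9),
  HCO3   = c(357, 183, 250, 360, 25.8, 1350),
  CL     = c(4.5, 50, 20, 28, 0.9, 285),
  MOY    = c(60.41, 43.38, 45.86, 71.75, 4.75, 284.61),
  stringsAsFactors = TRUE
)

# Columns 1 to 5 are categorical, so declare them as supplementary
# qualitative variables; the nine numeric columns drive the PCA
res <- PCA(eaux, quali.sup = 1:5, graph = FALSE)

res$eig              # eigenvalues and percentage of variance per component
head(res$ind$coord)  # coordinates of the waters on the components
```

PCA() standardizes the active variables by default (scale.unit = TRUE), so this route also avoids one large-valued column such as HCO3 dominating the first component.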