PCA: dimdesc() fails with "non convenient data"


I have run a PCA on a data set using prcomp. As a final step, I am trying to use the dimdesc() function from FactoMineR to obtain p-values that identify the variables most significantly associated with my principal components.

The data frame has seven variables all of which are numerical and there are no missing values. The names are standard names such as "RCH_Home" (just in case the names could be problematic).

I run the following call:

res.desc <- dimdesc(df_PCA, axes = c(1:2), proba = 0.05)

And get the following error message: Error in dimdesc(df_PCA, axes = c(1:2), proba = 0.05) : non convenient data

Any idea what might be going on?

Thanks!!!!


Solution

  • You should use the PCA() function from FactoMineR instead of prcomp().
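
The error most likely comes from a class check: as far as I can tell from the FactoMineR source, dimdesc() stops with "non convenient data" unless its first argument inherits from one of the package's own result classes (such as "PCA"). A minimal check in base R, using mtcars as a stand-in for the asker's data:

```r
# mtcars stands in for the original data frame (an assumption for illustration)
df_PCA <- prcomp(mtcars, scale. = TRUE)

class(df_PCA)            # "prcomp"
inherits(df_PCA, "PCA")  # FALSE, so dimdesc() would reject this object
```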

    Below is an example of PCA with FactoMineR.

    library(FactoMineR)
    library(factoextra)
    data(mtcars)  # built-in dataset: 32 cars, 11 numeric variables
    mtcars_pca <- PCA(mtcars)
    

    To check the percentage of variance explained by each component, inspect the eigenvalue table:

    > mtcars_pca$eig
            eigenvalue percentage of variance cumulative percentage of variance
    comp 1  6.60840025             60.0763659                          60.07637
    comp 2  2.65046789             24.0951627                          84.17153
    comp 3  0.62719727              5.7017934                          89.87332
    comp 4  0.26959744              2.4508858                          92.32421
    comp 5  0.22345110              2.0313737                          94.35558
    comp 6  0.21159612              1.9236011                          96.27918
    comp 7  0.13526199              1.2296544                          97.50884
    comp 8  0.12290143              1.1172858                          98.62612
    comp 9  0.07704665              0.7004241                          99.32655
    comp 10 0.05203544              0.4730495                          99.79960
    comp 11 0.02204441              0.2004037                         100.00000
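
For comparison, the same table can be rebuilt from a prcomp fit in base R. This is a sketch that assumes scale. = TRUE, which should correspond to PCA()'s default scale.unit = TRUE; the object names are illustrative.

```r
pca <- prcomp(mtcars, scale. = TRUE)

eigenvalue <- pca$sdev^2                    # variance carried by each component
pct <- 100 * eigenvalue / sum(eigenvalue)   # percentage of variance

eig_table <- cbind(
  eigenvalue = eigenvalue,
  "percentage of variance" = pct,
  "cumulative percentage of variance" = cumsum(pct)
)
rownames(eig_table) <- paste("comp", seq_along(eigenvalue))

round(eig_table[1:3, ], 2)
```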
    

    cos2 is the squared cosine, an index of how well a variable or an individual is represented on a given dimension. The closer the value is to 1, the better the representation.

    mtcars_pca$var$cos2
             Dim.1        Dim.2       Dim.3        Dim.4        Dim.5
    mpg  0.8685312 0.0006891117 0.031962249 1.369725e-04 0.0023634487
    cyl  0.9239416 0.0050717032 0.019276287 1.811054e-06 0.0007642822
    disp 0.8958370 0.0064482423 0.002370993 1.775235e-02 0.0346868281
    hp   0.7199031 0.1640467049 0.012295659 1.234773e-03 0.0651697911
    drat 0.5717921 0.1999959326 0.016295731 1.970035e-01 0.0013361275
    wt   0.7916038 0.0542284172 0.073281663 1.630161e-02 0.0012578888
    qsec 0.2655437 0.5690984542 0.101947952 1.249426e-03 0.0060588455
    vs   0.6208539 0.1422249798 0.115330572 1.244460e-02 0.0803189801
    am   0.3647715 0.4887450097 0.026555457 2.501834e-04 0.0018011675
    gear 0.2829342 0.5665806069 0.052667265 1.888829e-02 0.0005219259
    carb 0.3026882 0.4533387304 0.175213444 4.333912e-03 0.0291718181
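
For variables in a standardized PCA, cos2 on a dimension equals the squared correlation between the variable and the scores on that dimension. A base-R sketch that recomputes the table from a prcomp fit (object names are illustrative):

```r
pca <- prcomp(mtcars, scale. = TRUE)

# Squared correlation between each variable and each component's scores
cos2 <- cor(mtcars, pca$x)^2
colnames(cos2) <- paste0("Dim.", seq_len(ncol(cos2)))

round(cos2[, 1:2], 3)
rowSums(cos2)   # each variable's cos2 sums to 1 across all components
```

Squaring removes the sign of the correlation, so the result does not depend on the arbitrary orientation of each component.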
    


    With the FactoMineR object, dimdesc() runs without the error:

    res.desc <- dimdesc(mtcars_pca, axes = c(1:2), proba = 0.05)

    > head(res.desc)
    $Dim.1
    $quanti
         correlation      p.value
    cyl    0.9612188 2.471950e-18
    disp   0.9464866 2.804047e-16
    wt     0.8897212 9.780198e-12
    hp     0.8484710 8.622043e-10
    carb   0.5501711 1.105272e-03
    qsec  -0.5153093 2.542578e-03
    gear  -0.5319156 1.728737e-03
    am    -0.6039632 2.520665e-04
    drat  -0.7561693 5.575736e-07
    vs    -0.7879428 8.658012e-08
    mpg   -0.9319502 9.347042e-15
    
    attr(,"class")
    [1] "condes" "list " 
    
    $Dim.2
    $quanti
         correlation      p.value
    gear   0.7527155 6.712704e-07
    am     0.6991030 8.541542e-06
    carb   0.6733043 2.411011e-05
    drat   0.4472090 1.028069e-02
    hp     0.4050268 2.147312e-02
    vs    -0.3771273 3.335771e-02
    qsec  -0.7543861 6.138696e-07
    
    attr(,"class")
    [1] "condes" "list " 
    
    $call
    $call$num.var
    [1] 1
    
    $call$proba
    [1] 0.05
    
    $call$weights
     [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    
    $call$X
                                Dim.1  mpg cyl  disp  hp drat    wt  qsec vs am gear carb
    Mazda RX4           -0.6572132031 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
    Mazda RX4 Wag       -0.6293955058 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
    Datsun 710          -2.7793970426 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
    Hornet 4 Drive      -0.3117707086 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
    Hornet Sportabout    1.9744889419 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
    Valiant             -0.0561375337 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
    Duster 360           3.0026742880 14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
    Merc 240D           -2.0553287289 24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
    Merc 230            -2.2874083842 22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
    Merc 280            -0.5263812077 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
    Merc 280C           -0.5092054932 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
    Merc 450SE           2.2478104359 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
    Merc 450SL           2.0478227622 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
    Merc 450SLC          2.1485421615 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
    Cadillac Fleetwood   3.8997903717 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
    Lincoln Continental  3.9541231097 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
    Chrysler Imperial    3.5929719882 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
    Fiat 128            -3.8562837567 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
    Honda Civic         -4.2540325032 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
    Toyota Corolla      -4.2342207436 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
    Toyota Corona       -1.9041678566 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
    Dodge Challenger     2.1848507430 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
    AMC Javelin          1.8633834347 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
    Camaro Z28           2.8889945733 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
    Pontiac Firebird     2.2459189274 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
    Fiat X1-9           -3.5739682964 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
    Porsche 914-2       -2.6512550541 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
    Lotus Europa        -3.3857059882 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
    Ford Pantera L       1.3729574238 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
    Ferrari Dino        -0.0009899207 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
    Maserati Bora        2.6691258658 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
    Volvo 142E          -2.4205931001 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
    

    In short, dimdesc() expects a result object created by a FactoMineR function such as PCA(), so the PCA itself has to be run with the same package.
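
If you would rather keep the prcomp fit, the quantitative part of what dimdesc() reports can be approximated by hand: correlate each variable with the scores on an axis and keep the significant ones. The helper describe_axis below is hypothetical (not part of any package), and because the sign of a principal component is arbitrary, the correlations may come out with the opposite sign to FactoMineR's.

```r
pca <- prcomp(mtcars, scale. = TRUE)

# Hypothetical helper: correlation and p-value of every variable with one axis
describe_axis <- function(pca, data, axis = 1, proba = 0.05) {
  scores <- pca$x[, axis]
  res <- t(sapply(data, function(v) {
    ct <- cor.test(v, scores)   # Pearson correlation test
    c(correlation = unname(ct$estimate), p.value = ct$p.value)
  }))
  # Keep significant variables, sorted by decreasing correlation
  res <- res[res[, "p.value"] <= proba, , drop = FALSE]
  res[order(res[, "correlation"], decreasing = TRUE), , drop = FALSE]
}

desc1 <- describe_axis(pca, mtcars, axis = 1)
desc2 <- describe_axis(pca, mtcars, axis = 2)
```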

    If you want to choose how many dimensions to retain, you can use the paran package, which performs Horn's parallel analysis:

    library(paran)
    cars_paran<-paran(mtcars, graph = TRUE)
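
The idea behind parallel analysis can be sketched in base R: keep the components whose eigenvalues exceed those obtained from random, uncorrelated data of the same shape. This is the classic unadjusted comparison against a mean random baseline, with 200 simulated data sets and a fixed seed (all of these are assumptions for illustration), so it may not match paran's output exactly.

```r
set.seed(1)                       # random draws, so fix the seed
n <- nrow(mtcars)
p <- ncol(mtcars)

# Eigenvalues of the observed correlation matrix
observed <- eigen(cor(mtcars))$values

# Eigenvalues from uncorrelated normal data with the same dimensions
random <- replicate(200, eigen(cor(matrix(rnorm(n * p), n, p)))$values)
baseline <- rowMeans(random)

# Count the leading components that beat the random baseline
retain <- sum(cumprod(observed > baseline))
retain
```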
    
