I have conducted PCA on a set of data using prcomp
. As a final step I am trying to use the dimdesc()
function from FactoMineR
to obtain p-values that identify the most significantly associated variables with my principal components.
The data frame has seven variables all of which are numerical and there are no missing values. The names are standard names such as "RCH_Home" (just in case the names could be problematic).
I write the following function:
res.desc <- dimdesc(df_PCA, axes = c(1:2), proba = 0.05)
And get the following error message:
Error in dimdesc(df_PCA, axes = c(1:2), proba = 0.05) : non convenient data
Any idea what might be going on?
Thanks!!!!
You should use the PCA
function in sostitution of the prcomp
Below an example of PCA with FactoMineR
.
library(FactoMineR)
library(factoextra)
library(paran)
data(cars)
mtcars_pca<-cars_pca<-PCA(mtcars)
If you want to check the percentage of variance, you can do this:
mtcars_pca$eig
> mtcars_pca$eig
eigenvalue percentage of variance cumulative percentage of variance
comp 1 6.60840025 60.0763659 60.07637
comp 2 2.65046789 24.0951627 84.17153
comp 3 0.62719727 5.7017934 89.87332
comp 4 0.26959744 2.4508858 92.32421
comp 5 0.22345110 2.0313737 94.35558
comp 6 0.21159612 1.9236011 96.27918
comp 7 0.13526199 1.2296544 97.50884
comp 8 0.12290143 1.1172858 98.62612
comp 9 0.07704665 0.7004241 99.32655
comp 10 0.05203544 0.4730495 99.79960
comp 11 0.02204441 0.2004037 100.00000
Cos2 stands for squared cosine and is an index for the quality representation of both variables and individuals. The closer this value is to one, the better the quality.
mtcars_pca$var$cos2
Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
mpg 0.8685312 0.0006891117 0.031962249 1.369725e-04 0.0023634487
cyl 0.9239416 0.0050717032 0.019276287 1.811054e-06 0.0007642822
disp 0.8958370 0.0064482423 0.002370993 1.775235e-02 0.0346868281
hp 0.7199031 0.1640467049 0.012295659 1.234773e-03 0.0651697911
drat 0.5717921 0.1999959326 0.016295731 1.970035e-01 0.0013361275
wt 0.7916038 0.0542284172 0.073281663 1.630161e-02 0.0012578888
qsec 0.2655437 0.5690984542 0.101947952 1.249426e-03 0.0060588455
vs 0.6208539 0.1422249798 0.115330572 1.244460e-02 0.0803189801
am 0.3647715 0.4887450097 0.026555457 2.501834e-04 0.0018011675
gear 0.2829342 0.5665806069 0.052667265 1.888829e-02 0.0005219259
carb 0.3026882 0.4533387304 0.175213444 4.333912e-03 0.0291718181
res.desc <- dimdesc(mtcars_pca, axes = c(1:2), proba = 0.05)
> head(res.desc)
$Dim.1
$quanti
correlation p.value
cyl 0.9612188 2.471950e-18
disp 0.9464866 2.804047e-16
wt 0.8897212 9.780198e-12
hp 0.8484710 8.622043e-10
carb 0.5501711 1.105272e-03
qsec -0.5153093 2.542578e-03
gear -0.5319156 1.728737e-03
am -0.6039632 2.520665e-04
drat -0.7561693 5.575736e-07
vs -0.7879428 8.658012e-08
mpg -0.9319502 9.347042e-15
attr(,"class")
[1] "condes" "list "
$Dim.2
$quanti
correlation p.value
gear 0.7527155 6.712704e-07
am 0.6991030 8.541542e-06
carb 0.6733043 2.411011e-05
drat 0.4472090 1.028069e-02
hp 0.4050268 2.147312e-02
vs -0.3771273 3.335771e-02
qsec -0.7543861 6.138696e-07
attr(,"class")
[1] "condes" "list "
$call
$call$num.var
[1] 1
$call$proba
[1] 0.05
$call$weights
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
$call$X
Dim.1 mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 -0.6572132031 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag -0.6293955058 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 -2.7793970426 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive -0.3117707086 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 1.9744889419 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
Valiant -0.0561375337 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
Duster 360 3.0026742880 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
Merc 240D -2.0553287289 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
Merc 230 -2.2874083842 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
Merc 280 -0.5263812077 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
Merc 280C -0.5092054932 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
Merc 450SE 2.2478104359 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
Merc 450SL 2.0478227622 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
Merc 450SLC 2.1485421615 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
Cadillac Fleetwood 3.8997903717 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
Lincoln Continental 3.9541231097 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
Chrysler Imperial 3.5929719882 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
Fiat 128 -3.8562837567 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
Honda Civic -4.2540325032 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
Toyota Corolla -4.2342207436 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
Toyota Corona -1.9041678566 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
Dodge Challenger 2.1848507430 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
AMC Javelin 1.8633834347 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
Camaro Z28 2.8889945733 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
Pontiac Firebird 2.2459189274 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
Fiat X1-9 -3.5739682964 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
Porsche 914-2 -2.6512550541 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
Lotus Europa -3.3857059882 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
Ford Pantera L 1.3729574238 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
Ferrari Dino -0.0009899207 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
Maserati Bora 2.6691258658 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
Volvo 142E -2.4205931001 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
it should be use the function from the same package.
If you want to choose how many dimensions you need, you can do this by the param
package
library(paran)
cars_paran<-paran(mtcars, graph = TRUE)