The dataset I am working on looks like this: DATA. There are 6 different countries, and r_1..r_13 specify the reasons. I want to apply PCA to this dataset to find the significant reasons for each country. My question is: how can I run PCA for each country without reading a separate file per country, reading the entire file shown above just once instead? Also, please check the code I am using for the PCA:
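# numeric: assumed to hold only the numeric columns r_1..r_13
pca <- prcomp(numeric, center = TRUE, scale. = TRUE)
summary(pca)
eigen_val <- pca$sdev^2          # eigenvalues = variances of the PCs
sum(eigen_val)                   # with scale. = TRUE this equals the number of variables
prop_var <- round(eigen_val / sum(eigen_val), 4)
round(sum(prop_var[1:13]), 4)    # total proportion over all 13 PCs
load <- pca$rotation             # loadings, one column per PC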
After computing the rotation matrix, I will check which PCs are most correlated with which observed variables and decide the significance of the variables accordingly (on the basis that the more PCs a variable is correlated with, the more significant it is). Kindly suggest whether this approach is correct. Thanks!
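To make the check described above concrete, here is a minimal sketch of how the variable-to-PC correlations can be obtained from the rotation matrix. The real data is not shown, so the built-in USArrests data set is used as a stand-in, and the name var_pc_cor is made up for illustration. It relies on the fact that, when the variables are standardized (scale. = TRUE), the correlation between a variable and a PC is the loading multiplied by that PC's standard deviation.

```r
# Stand-in data (assumption): USArrests ships with R.
# Replace it with the data frame holding r_1..r_13.
numeric <- USArrests
pca <- prcomp(numeric, center = TRUE, scale. = TRUE)

# Correlation of each variable with each PC:
# multiply every column of the rotation matrix by that PC's sdev.
var_pc_cor <- sweep(pca$rotation, 2, pca$sdev, "*")

# Cross-check against correlations computed directly from the scores.
all.equal(var_pc_cor, cor(numeric, pca$x))

# Rows are variables, columns are PCs.
round(var_pc_cor, 2)
```

Large absolute values in a row show which PCs a given variable is most strongly correlated with.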
Here's a simple starting point for a solution that you can tweak to get the results in your desired format. Let's assume you're working with the iris dataset in R, and you want to do PCA for each Species, much as you want to do PCA for each country in your data.
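# iris ships with R (datasets package), so no extra library is needed
data(iris)
# Split the data frame into a list with one data frame per species
Iris <- split(iris, iris$Species)
# Run PCA on the numeric columns of each group; results are stored as pca1, pca2, pca3
for (i in seq_along(Iris)) {
  assign(paste0("pca", i), prcomp(Iris[[i]][which(names(iris) != "Species")], center = TRUE, scale. = TRUE))
}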
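As a possible variation on the loop above, the per-group results can be kept in one named list instead of separate pca1, pca2, ... objects, which makes them easier to look up by group name. This is only a sketch: dat, group_col and pca_by_group are made-up names, and iris with its Species column stands in for your file and its country column.

```r
# Stand-in for the real file (assumption): iris plays the role of the data,
# and "Species" plays the role of the country column.
dat <- iris
group_col <- "Species"   # assumed name; use your actual country column

# Split once, then run prcomp on the numeric columns of each group.
pca_by_group <- lapply(split(dat, dat[[group_col]]), function(d) {
  prcomp(d[names(d) != group_col], center = TRUE, scale. = TRUE)
})

names(pca_by_group)                  # one entry per group
summary(pca_by_group[["setosa"]])    # variance explained for one group
pca_by_group[["setosa"]]$rotation    # loadings for one group
```

Note that scale. = TRUE will fail for a group in which some column is constant, so it is worth checking for that in the real data before running this per country.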