I'm trying to make a nice plot of the principal components (PCs) together with the cumulative variance explained. The data frame I'm working with is available at https://www.kaggle.com/miroslavsabo/young-people-survey?select=responses.csv
df.responses <- read.csv("Data/responses.csv")
pref <- colnames(df.responses[1:63]) # columns for Music, Movies and Hobbies preferences
# replace the missing values in each of these columns with the column median
for (i in seq_along(pref)) {
  df.responses[is.na(df.responses[, i]), i] <- median(df.responses[, i], na.rm = TRUE)
}
df.movies <- data.frame(df.responses[20:31])
Above I just loaded the data frame, replaced the NAs in the columns I'm interested in with the column median, and selected the subset I want to run PCA on.
library(ggplot2)
library(factoextra)
pca.movies <- prcomp(df.movies, scale. = TRUE)
pca.movies$rotation <- -pca.movies$rotation
pca.movies$x <- -pca.movies$x
fviz_pca_var(pca.movies,
col.var = "contrib",
gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
repel = TRUE
)
pv.movies <- pca.movies$sdev^2
pvp.movies <- pv.movies/sum(pv.movies)
pvp.movies
fviz_eig(pca.movies,
addlabels = T,
barcolor = "#E7B800",
barfill = "#E7B800",
linecolor = "#00AFBB",
choice = "variance",
ylim=c(0,25))
plot(cumsum(pvp.movies), xlab = "Principal component", ylab = "Cumulative proportion of Variance Explained", ylim = c(0, 1), type = "b")
With the above I managed to obtain two nice PCA plots. I would like to add the cumulative sum line (the one shown in the third, ugly plot) to the second plot. Is there a way to add such a line to the fviz_eig plot? I know this PCA is not really efficient; I'm just challenging myself with some dataviz.
The object returned by fviz_eig is a ggplot object, thus you can merge the two plots as follows:
p <- fviz_eig(pca.movies,
addlabels = T,
barcolor = "#E7B800",
barfill = "#E7B800",
linecolor = "#00AFBB",
choice = "variance",
ylim=c(0,25))
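# rescale the cumulative percentage (0-100) to the range of the primary axis (0-25)
df <- data.frame(x = seq_along(pvp.movies),
                 y = cumsum(pvp.movies) * 100 / 4)
p <- p +
  geom_point(data = df, aes(x, y), size = 2, color = "#00AFBB") +
  geom_line(data = df, aes(x, y), color = "#00AFBB") +
  # the secondary axis multiplies by 4 to undo the rescaling, so it reads 0-100
  scale_y_continuous(sec.axis = sec_axis(~ . * 4,
                                         name = "Cumulative proportion of Variance Explained"))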
print(p)
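The factor 4 above is simply 100 divided by the upper limit of the primary axis (25). If you do not want to hard-code it, a minimal sketch along these lines should work. Note the assumptions: it uses the built-in mtcars data as a stand-in for the survey columns, and the names pct, cum_pct, k and df_cum are only illustrative. Here the tallest bar is used as the reference height, so the cumulative line ends level with it.

```r
library(ggplot2)
library(factoextra)

# Stand-in data (assumption): built-in mtcars instead of the survey columns
pca <- prcomp(mtcars, scale. = TRUE)

# Percentage of variance per component and its running total
pct <- 100 * pca$sdev^2 / sum(pca$sdev^2)
cum_pct <- cumsum(pct)

# Ratio between the secondary axis (0-100) and the primary axis;
# a primary axis topping out at 25 would give the 4 used above
k <- 100 / max(pct)

# Cumulative percentages squeezed into the range of the bars
df_cum <- data.frame(x = seq_along(cum_pct), y = cum_pct / k)

p <- fviz_eig(pca, addlabels = TRUE, ncp = length(pct)) +
  geom_point(data = df_cum, aes(x, y), size = 2, color = "#00AFBB") +
  geom_line(data = df_cum, aes(x, y), color = "#00AFBB") +
  # multiply by k on the secondary axis to undo the rescaling
  scale_y_continuous(sec.axis = sec_axis(~ . * k,
                                         name = "Cumulative variance explained (%)"))
print(p)
```

With your own data, replacing pca with pca.movies should be the only change needed; ncp is set explicitly because fviz_eig shows only the first 10 components by default.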