
R PCA ggbiplot error: replacement has 36 rows, data has 35


I'm new to R. I was trying to run a PCA and display the result with ggbiplot, but I am stuck on an error I cannot solve. The problem may be in my data, since the same code works fine with another data set. In case you would like to reproduce the problem, the code and the data files are at the following link:

https://drive.google.com/drive/folders/0B2jQ7Vh3S3PaZkt3Y2ZyaV9XaXc

  • Code: pca-plot.R
  • Data file 1: dat1.rda (this one works fine)
  • Data file 2: dat2.rda (this one has the problem)

I appreciate any help. The error I got is shown below.

Thank you, --we

> g <- ggbiplot(tr.pca, obs.scale = 1, var.scale = 1, 
+               groups = Ydfall, 
+               ellipse = TRUE, 
+               circle = TRUE)
Error in `$<-.data.frame`(`*tmp*`, "groups", value = c(1L, 1L, 1L, 1L,  : 
  replacement has 36 rows, data has 35
> g <- g + scale_color_discrete(name = '')
Scale for 'colour' is already present. Adding another scale for 'colour',     which will replace the existing scale.
> g <- g + theme(legend.direction = 'horizontal', 
+                legend.position = 'top')
> g<- g+ geom_point(size=1, shape=1, color="black", stroke=2)  
> 
> print(g)
> 

Solution

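  • The dfall in dat2.rda contains NA values (check with which(is.na(dfall), arr.ind = TRUE)), and that is what causes the error. You applied na.omit() to the data passed to prcomp(), which dropped the incomplete row, but you did not apply it when you created Ydfall. As a result the PCA scores have 35 rows while the groups vector still has 36 elements, which is exactly the mismatch the error message reports.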

    Ydfall <- na.omit(dfall)[, 1]   # quick fix: take the groups from the NA-free rows
    
    # Better: drop the incomplete rows first, then build everything from the cleaned data
    dfall <- na.omit(dfall)
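
To make the row mismatch concrete, here is a minimal sketch in base R. It does not use the data from the question: the toy data frame, its column names (group, x1, x2, x3), and the names dfall_clean and groups are all made up for illustration, and the sketch assumes that the first column holds the group labels, as in Ydfall above.

```r
# Hypothetical stand-in for dfall: column 1 is the group label, the other
# columns are numeric measurements, and one row contains an NA.
set.seed(1)
dfall <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 12)),
  x1 = rnorm(36),
  x2 = rnorm(36),
  x3 = rnorm(36)
)
dfall[5, "x2"] <- NA

# Locate the missing values (row and column indices).
which(is.na(dfall), arr.ind = TRUE)

# Drop the incomplete rows once, then derive everything from the cleaned data.
dfall_clean <- na.omit(dfall)
groups <- dfall_clean[, 1]
tr.pca <- prcomp(dfall_clean[, -1], center = TRUE, scale. = TRUE)

# The PCA scores and the grouping vector now have the same number of
# observations, which is what the groups argument of ggbiplot() needs.
stopifnot(nrow(tr.pca$x) == length(groups))
```

Because both the PCA input and the grouping vector come from the same cleaned data frame, their lengths cannot drift apart, and the ggbiplot() call from the question should then accept groups without the replacement error.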