
How to make the biplot variable names clearer using ggbiplot


I have a dataset that can be downloaded from https://gist.github.com/anonymous/5f1135e4f750a39b0255

I am trying to plot a PCA with ggbiplot using the following code:

data <- read.delim("path to the data.txt")
data.pca <- prcomp(data, center = TRUE, scale. = TRUE)
library(ggbiplot)
g <- ggbiplot(data.pca, obs.scale =1, var.scale=1, ellipse = TRUE, circle=TRUE)
g <- g + scale_color_discrete(name='')
g <- g + theme(legend.direction = 'horizontal', legend.position = 'top')
print(g)

However, it is very difficult to read the variable names on the biplot arrows.

Is there any way to make them clearer or display them better?


Solution

  • I think one way to make it clearer is to adjust the size and position of the labels using the varname.size and varname.adjust arguments. However, with a lot of variables it still looks crowded. Increasing the length of the arrows (similar to stats::biplot()) makes it look somewhat better, in my opinion.

    # install ggbiplot (if needed)
    # library(devtools)
    # install_github("vqv/ggbiplot")
    
    library(httr) 
    library(ggbiplot)
    
    # read data
    url <- "https://gist.githubusercontent.com/anonymous/5f1135e4f750a39b0255/raw/data.txt"
    dat <- read.table(text=content(GET(url), as="text"), header=TRUE)
    
    # pca 
    data.pca <- prcomp(dat, center = TRUE, scale. = TRUE)
    
    # original plot + increase labels size and space from line
    p <- ggbiplot(data.pca, obs.scale=1, 
                  var.scale=1, circle=FALSE, 
                  varname.size=4, varname.adjust=2)  
    p
    


    # use coord_equal() to change the aspect ratio of the plot (this rules out circle = TRUE)
    p <- p + coord_equal(ratio = 1.5) + theme_classic()
    p
    


    To extend the arrows, the x and y coordinates of their end points need to be recalculated. You can then use these to edit the relevant grobs and change any other parameter (colour, size, rotation, etc.). You could go the whole ggplotGrob(p) route, but the code below just uses grid.edit() on the plot that is currently drawn.

    # function to rescale the x & y positions of the lines and labels:
    # keeps the start point a0 and moves the end point a1 out by a factor M
    f <- function(a0, a1, M)
          {
          a0 <- as.numeric(a0)
          a1 <- as.numeric(a1)
          out <- M * (a1 - a0) + a0
          grid::unit(out, "native")
          }
    
    # get list of grobs in current graphics window
    grobs <- grid.ls(print=FALSE)  
    
    # find segments grob for the arrows
    s_id <- grobs$name[grep("segments", grobs$name)]
    
    # edit length and colour of lines
    seg <- grid.get(gPath(s_id[2]))     
    grid.edit(gPath(s_id[2]),  
                x1=f(seg$x0, seg$x1, 2), 
                y1=f(seg$y0, seg$y1, 2),
                gp=gpar(col="red"))
    
    
    # find text grob for the arrow labels
    lab_id <- grobs$name[grep("text", grobs$name)]
    
    # edit position of text, and rotate and colour labels
    seg2 <- grid.get(gPath(lab_id)) 
    grid.edit(gPath(lab_id),  
                x=f(seg$x0, seg2$x, 2), 
                y=f(seg$y0, seg2$y, 2),
                rot=0,
                gp=gpar(col="red"))
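
As a quick check of the arithmetic in f(), here is a small sketch with plain numbers instead of grid units. The name extend() and the coordinates are hypothetical, used only for this illustration; the formula is the same start + M * (end - start) used above.

```r
# Hypothetical helper: same rescaling as f(), without converting to grid units
extend <- function(a0, a1, M) a0 + M * (a1 - a0)

# Made-up start (x0) and end (x1) coordinates for three arrows
x0 <- c(0, 0, 1)
x1 <- c(1.5, -0.5, 2)

# Doubling the length moves each end point twice as far from its start
new_x1 <- extend(x0, x1, 2)
print(new_x1)  # gives 3, -1, 3
```

With M = 1 the end points are unchanged, and arrows that do not start at the origin (the third one here) are still stretched along their own direction.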
    


    Whether this makes it better is subjective; it may be easier just to use biplot(), or even to define a new function.
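
For comparison, a minimal sketch of the base-graphics biplot() route, assuming the built-in USArrests data as a stand-in for the question's data. The expand, cex and col values are illustrative choices, not tuned ones.

```r
# Stand-in data: USArrests ships with R (an assumption; swap in your own data frame)
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# expand > 1 lengthens the variable arrows relative to the observations;
# cex and col each take two values: observations first, then variables
biplot(pca, expand = 1.2, cex = c(0.6, 1), col = c("grey50", "red"))
```

Shrinking the observation labels and colouring the variable labels differently tends to make the arrow names easier to pick out, though with many variables the plot can still be crowded.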