I have a dataset that can be downloaded from https://gist.github.com/anonymous/5f1135e4f750a39b0255
I am trying to plot a PCA biplot with ggbiplot using the following code:
data <- read.delim("path to the data.txt")
data.pca <- prcomp(data, center = TRUE, scale. = TRUE)
library(ggbiplot)
g <- ggbiplot(data.pca, obs.scale = 1, var.scale = 1, ellipse = TRUE, circle = TRUE)
g <- g + scale_color_discrete(name='')
g <- g + theme(legend.direction = 'horizontal', legend.position = 'top')
print(g)
However, the variable names on the biplot arrows are very hard to read.
Is there any way to make them clearer or display them better?
I think one way to make it clearer is to adjust the size and position of the labels using the varname.size and varname.adjust arguments. However, with a lot of variables it still looks crowded. Increasing the length of the arrows (similar to stats::biplot()) makes it look somewhat better (imo).
# install ggbiplot (uncomment if it is not installed yet)
#require(devtools)
#install_github("vqv/ggbiplot")
library(httr)
library(ggbiplot)
# read data
url <- "https://gist.githubusercontent.com/anonymous/5f1135e4f750a39b0255/raw/data.txt"
dat <- read.table(text=content(GET(url), as="text"), header=TRUE)
# pca
data.pca <- prcomp(dat, center = TRUE, scale. = TRUE)
# original plot + increase label size and spacing from the arrows
p <- ggbiplot(data.pca, obs.scale = 1,
              var.scale = 1, circle = FALSE,
              varname.size = 4, varname.adjust = 2)
p
# use coord_equal() to change the aspect ratio of the plot (this rules out using circle)
p <- p + coord_equal(1.5) + theme_classic()
p
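For comparison, the arrow stretching in stats::biplot() mentioned above can be sketched as follows. This is only an illustration: USArrests is a built-in dataset used as a stand-in for the data in the question, and the expand, cex and col values are arbitrary choices.

```r
# Sketch only: USArrests stands in for the data from the question
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# expand > 1 stretches the variable arrows relative to the observations;
# cex and col take two values: first for observations, second for variables
biplot(pca, expand = 1.5, cex = c(0.6, 1), col = c("grey50", "red"))
```

Shrinking the observation labels with the first cex value tends to help as much as stretching the arrows.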
To extend the arrows, the x and y coordinates need to be recalculated. You can then use these to edit the relevant grobs and change any other parameter (colour, size, rotation, etc.). (You could go the whole ggplotGrob(p) approach, but I just use grid.edit() below.)
# function to rescale the x & y positions of the lines and labels
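# M is the multiplier applied to the distance between a0 and a1
# (the original default M=M was self-referential and fails if M is omitted)
f <- function(a0, a1, M = 1)
{
  l <- lapply(as.list(environment()), as.numeric)
  out <- l$M * (l$a1 - l$a0) + l$a0
  grid::unit(out, "native")
}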
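# get list of grobs in current graphics window
# (with newer ggplot2 versions you may need to call grid.force() first
#  so that the child grobs show up in the listing)
library(grid)
grobs <- grid.ls(print=FALSE)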
# find segments grob for the arrows
s_id <- grobs$name[grep("segments", grobs$name)]
# edit length and colour of lines
seg <- grid.get(gPath(s_id[2]))
grid.edit(gPath(s_id[2]),
          x1=f(seg$x0, seg$x1, 2),
          y1=f(seg$y0, seg$y1, 2),
          gp=gpar(col="red"))
# find text grob for the arrow labels
lab_id <- grobs$name[grep("text", grobs$name)]
# edit position of text, and rotate and colour labels
seg2 <- grid.get(gPath(lab_id))
grid.edit(gPath(lab_id),
          x=f(seg$x0, seg2$x, 2),
          y=f(seg$y0, seg2$y, 2),
          rot=0,
          gp=gpar(col="red"))
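To see what the rescaling does, here is a small standalone check of the same arithmetic. The name rescale() and the coordinates are made up for illustration; plain numbers stand in for the native coordinates stored in the grobs.

```r
library(grid)

# Same arithmetic as f() above; rescale() is a hypothetical name
rescale <- function(a0, a1, M = 2) {
  a0 <- as.numeric(a0)
  a1 <- as.numeric(a1)
  # move the end point M times as far from the start point
  unit(M * (a1 - a0) + a0, "native")
}

# Made-up arrow running from 0.5 to 0.75 along one axis
doubled   <- rescale(0.5, 0.75, M = 2)  # end point moves from 0.75 to 1
unchanged <- rescale(0.5, 0.75, M = 1)  # M = 1 keeps the end point at 0.75
```

The start point stays fixed, so only the arrow tip (and, with the same call, the label position) is pushed outwards.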
Whether this makes it better is subjective, and perhaps it is easier just to use biplot() or even define a new function.
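As a rough idea of what such a new function could look like, here is a minimal sketch built on ggplot2 alone. The names simple_biplot(), arrow_scale and label_size are hypothetical and not part of ggbiplot, USArrests is a stand-in dataset, and the arrow lengths are scaled purely for readability rather than following ggbiplot's obs.scale/var.scale conventions.

```r
library(ggplot2)

# Hypothetical helper, not part of ggbiplot: plots PC1 vs PC2 scores and
# draws the loadings as arrows multiplied by arrow_scale
simple_biplot <- function(pca, arrow_scale = 2, label_size = 4) {
  scores   <- as.data.frame(pca$x[, 1:2])
  loadings <- as.data.frame(pca$rotation[, 1:2])
  loadings$var <- rownames(loadings)

  # stretch the arrows; this is cosmetic and changes only their length
  loadings$PC1 <- loadings$PC1 * arrow_scale
  loadings$PC2 <- loadings$PC2 * arrow_scale

  ggplot() +
    geom_point(data = scores, aes(x = PC1, y = PC2), colour = "grey50") +
    geom_segment(data = loadings,
                 aes(x = 0, y = 0, xend = PC1, yend = PC2),
                 arrow = grid::arrow(length = grid::unit(0.2, "cm")),
                 colour = "red") +
    # place each label slightly beyond its arrow tip
    geom_text(data = loadings,
              aes(x = PC1 * 1.1, y = PC2 * 1.1, label = var),
              colour = "red", size = label_size) +
    coord_equal() +
    theme_classic()
}

# Usage with the stand-in dataset
p2 <- simple_biplot(prcomp(USArrests, center = TRUE, scale. = TRUE),
                    arrow_scale = 3)
print(p2)
```

Because the arrows and labels are ordinary ggplot2 layers here, their length, colour and size can be changed through arguments instead of editing grobs afterwards.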