I have produced a PCA plot in which I plot a number of cells based on their expression of various genes. In this plot, I want to give some of the points a separate color. I tried to achieve this by creating "groups", where I sort the cells based on whether or not they express "gene1".
Here's what my data frame looks like (gene1, gene2, etc. are the column names and cell_1, cell_2, etc. are the row names):
        gene1     gene2      gene3      gene4      gene5
cell_1 0.0000  0.279204  25.995400  46.171700  94.234100
cell_2 0.0000 23.456000  77.339800 194.241000 301.234000
cell_3 2.0000 13.100000  45.309200   0.776565   0.000000
cell_4 0.0000 10.500000 107.508000   3.032500   0.000000
cell_5 3.0000  0.000000   0.266139   0.762981 123.371000
Here's the code I use to try to achieve this:
library(ggplot2)
library(ggfortify)
# Group cells based on expression of a certain gene (to use for color labels in the next step)
groups <- factor(ifelse(df$gene1 > 0, "Positive", "Others"))
#Calculate PCs and plot PCA
autoplot(prcomp(log(df[]+1)), colour="Positive")
When I run this code, I get the following error:
Error in grDevices::col2rgb(colour, TRUE) : invalid color name 'Positive'
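How about this? As far as I can tell, autoplot() treats colour as either a literal color name or the name of a column in the data frame passed via data. "Positive" is neither, which is why col2rgb() complains. Store the groups as a column of df, pass df as data, and give colour the column name: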
df$groups <- factor(ifelse(df$gene1 > 0, "Positive", "Others"))
head(df)
      gene1     gene2    gene3     gene4      gene5   groups
1 0.5638534  8.968558 94.40170  62.93106 290.442698 Positive
2 0.0000000 15.248374 45.87507 204.21703 291.501669   Others
3 1.9059518 19.488162 75.89302  97.69643 177.833347 Positive
4 1.9449987  6.358773 54.97159  41.54307 164.835188 Positive
5 0.0000000 16.568077 31.62370  23.72278  31.774541   Others
6 1.7199368  3.788276 80.51450 102.82221   6.259461 Positive
# df[1:5] keeps only the five gene columns, so the new factor column is left out of the PCA
autoplot(prcomp(log(df[1:5]+1)), data=df, colour='groups')
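If you also want to choose the two colors yourself, here is a minimal self-contained sketch using the five cells from the question. The gene_cols helper and the particular colors ("grey60", "red") are my own choices for illustration, not anything from the original post. It relies on autoplot() returning a regular ggplot object, so an ordinary ggplot2 scale can be added to it.

```r
library(ggplot2)
library(ggfortify)

# Toy data copied from the question (rows are cells, columns are genes)
df <- data.frame(
  gene1 = c(0, 0, 2, 0, 3),
  gene2 = c(0.279204, 23.456, 13.1, 10.5, 0),
  gene3 = c(25.9954, 77.3398, 45.3092, 107.508, 0.266139),
  gene4 = c(46.1717, 194.241, 0.776565, 3.0325, 0.762981),
  gene5 = c(94.2341, 301.234, 0, 0, 123.371),
  row.names = paste0("cell_", 1:5)
)

# Hypothetical helper: names of the expression columns that go into the PCA
gene_cols <- paste0("gene", 1:5)

# Store the grouping in the data frame so autoplot() can look it up by name
df$groups <- factor(ifelse(df$gene1 > 0, "Positive", "Others"))

# PCA on the log-transformed expression values only (the factor column is excluded)
pca <- prcomp(log(df[gene_cols] + 1))

# colour = "groups" maps the column to color; the manual scale sets the actual colors
p <- autoplot(pca, data = df, colour = "groups") +
  scale_colour_manual(values = c(Others = "grey60", Positive = "red"))
print(p)
```

Selecting the gene columns by name rather than by position (df[1:5]) keeps the PCA input correct even if more annotation columns are added to df later.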