I wanted to refresh my R knowledge using what I had seen in class last year, but I ran into a problem. I wanted to generate a plot for a file about flowers, with a specific color in the plot for each species (setosa, versicolor, virginica). This worked last year, but the exact same code now gives me an error:
> irisData <- read.table("irisData2.txt", header=TRUE)
> plot(x=irisData$Sepal.Length, y=irisData$Petal.Length, col=irisData$Species)
Error in plot.xy(xy, type, ...) : invalid color name 'setosa'
This used to give me a scatterplot with three colors, one for each species, but now it just doesn't work. How can I solve this? Note: I use a new laptop now, and I updated R to the latest version (1.4.1106). Thank you in advance.
You probably used to run this code on an R version older than 4.0, and now you’re running it on R 4.0 or later.
R 4.0 changed the default handling of string columns in read.table (and other functions). As a consequence, the Species column of your data.frame is now of type character, where it used to be of type factor.
To make your code work again, change the column type:
irisData$Species = as.factor(irisData$Species)
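As an alternative to converting the column afterwards, the conversion can be requested when the file is read, via the stringsAsFactors argument of read.table. The following is only a sketch: since irisData2.txt is not available here, it assumes the file has the same columns as R's built-in iris data set and writes that to a temporary file as a stand-in.

```r
# Stand-in for irisData2.txt (assumption: same columns as the built-in iris data).
tmp <- tempfile(fileext = ".txt")
write.table(iris, tmp, row.names = FALSE)

# With the R >= 4.0 default, string columns are read as character.
as_chr <- read.table(tmp, header = TRUE)
class(as_chr$Species)

# Asking for factors explicitly restores the pre-4.0 behaviour.
as_fct <- read.table(tmp, header = TRUE, stringsAsFactors = TRUE)
class(as_fct$Species)

# The original plot call works again, because col now receives a factor.
plot(x = as_fct$Sepal.Length, y = as_fct$Petal.Length, col = as_fct$Species)
```

Note that stringsAsFactors = TRUE converts every string column in the file, whereas as.factor on a single column only touches that column.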
To offer a more detailed explanation of the error message, plot’s col argument accepts either a vector of colours, or a vector of numbers (or factors, which are internally integers). If the argument is a vector of numbers, then these numbers are used to index into the currently set palette(). However, if the argument is a character string vector (as it is in your code when running on R 4.0), then these are interpreted as colour names/values instead of indices in the current palette. This fails, since the “iris” species aren’t valid colour names.
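To illustrate the difference, here is a small sketch of what happens with a factor, plus one way to keep a character column and still get one colour per species. The built-in iris data set is used as a stand-in for your file, and the colour choices in species_colours are arbitrary, picked only for the example.

```r
species <- factor(c("setosa", "versicolor", "virginica"))

# A factor is stored as integer codes; as col values these act as palette indices.
codes <- as.integer(species)
colours <- palette()[codes]   # the colours that plot() would pick for each level

# Alternative: map species names to real colour names explicitly.
# (Hypothetical colour choices; any valid colour names would do.)
species_colours <- c(setosa = "red", versicolor = "darkgreen", virginica = "blue")
point_colours <- species_colours[as.character(iris$Species)]

plot(iris$Sepal.Length, iris$Petal.Length, col = point_colours)
```

The explicit mapping has the side benefit that the colours no longer depend on the current palette or on the order of the factor levels.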