
R does not assign colors to plot() function


I wanted to revise my R knowledge using what I had seen in class last year, but I ran into a problem. I wanted to generate a plot from a file about flowers, with a separate color for each species (setosa, versicolor, virginica). This worked last year, but the exact same code now gives me an error:

> irisData <- read.table("irisData2.txt", header=TRUE)

> plot(x=irisData$Sepal.Length, y=irisData$Petal.Length, col=irisData$Species)

Error in plot.xy(xy, type, ...) : invalid color name 'setosa'

This used to give me a scatterplot with three colors, one per species, but now it just doesn't work. How can I solve this? Note: I use a new laptop now, and I updated R to the latest version (1.4.1106). Thank you in advance.


Solution

  • You probably used to run this code on an R version older than 4.0, and you’re now running it on R 4.0 or later.

    R 4.0 changed the default of the stringsAsFactors argument in read.table (and in related functions such as data.frame) from TRUE to FALSE. As a consequence, the Species column of your data.frame is now of type character, where it used to be of type factor.

    To make your code work again, change the column type:

    irisData$Species = as.factor(irisData$Species)
    

    To explain the error message in more detail: plot’s col argument accepts either a vector of colour names/values or a vector of numbers (or a factor, which is stored internally as integers). Numbers are used as indices into the currently set palette(). A character vector, on the other hand (which is what your code passes on R 4.0 or later), is interpreted as colour names/values rather than as indices into the palette. This fails because the iris species names aren’t valid colour names.
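
The difference between the two column types can be sketched with a small example. The inline text below is a made-up stand-in for irisData2.txt (its real contents are not shown in the question), and the variable names are only illustrative. Passing stringsAsFactors = TRUE to read.table is an alternative to converting the column afterwards with as.factor.

```r
# Hypothetical stand-in for irisData2.txt: same column names, four invented rows.
txt <- "Sepal.Length Petal.Length Species
5.1 1.4 setosa
7.0 4.7 versicolor
6.3 6.0 virginica
4.9 1.4 setosa"

# stringsAsFactors = FALSE is the default since R 4.0; it is spelled out here
# so the example behaves the same on older versions.
as_chr <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
class(as_chr$Species)  # "character": plot() would treat these as colour names

# Requesting factors at read time restores the pre-4.0 behaviour.
as_fct <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
class(as_fct$Species)  # "factor"

# A factor is stored as integer codes (levels sorted alphabetically here),
# and those codes are what index into the current palette.
codes <- as.integer(as_fct$Species)  # 1 2 3 1
point_cols <- palette()[codes]       # one palette colour per row

# With a factor, col= works again: one colour per species.
plot(x = as_fct$Sepal.Length, y = as_fct$Petal.Length, col = as_fct$Species)
```

The actual colours depend on the palette in use, so the same codes can map to different colours across R versions or after a call to palette().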