Error bar chart base R


I'm new to R. For an assignment I have to create a grouped bar chart with error bars in base R (so no packages allowed), with lattice and with ggplot2. For the base R graph I made up some data and I tried to produce a simple bar chart like this:

San_Diego <- c(65,20,74)
Rosarito <- c(34,35,23)
La_Paz <- c(21,71,28)
Mating_strategy <- c("Ultradominant","Dominant","Sneaker")
col <- c("darkorange1","skyblue3","gold2")        

lizards <- data.frame(row.names=Mating_strategy, San_Diego, 
                  Rosarito, La_Paz)
lizards.matrix <- as.matrix(lizards)

barplot(lizards.matrix,
        beside=T,
        col=col,
        ylim=c(0,80),
        xlab="Site",ylab="Frequency",
        legend.text=row.names(lizards.matrix),
        args.legend=list(x="top",bty="n"),
        las=1,
        cex.axis=1.2)

But now I'm stuck trying to add error bars to my chart. I tried to do it as described here (http://sickel.net/blogg/?p=1284) but I don't really understand what they are doing and when I tried it, it produced an entirely different, unusable graph. I also found this solution (http://imgur.com/126hJSI) online but I don't understand where I'm supposed to get those ucl and lcl values for my data, so that didn't work out so well either.

I'm afraid I have a LOT to learn but I hope someone here can help me out a bit.

Thanks in advance,

Marlies


Solution

  • Following http://sickel.net/blogg/?p=1284, I added error bars to your bar plot as follows.

    First, I run the code that defines the sample data (i.e. everything up to and including the line that defines lizards.matrix). After that, the plot can be created with the following code:

    # create bar plot
    bp <- barplot(lizards.matrix,
                  beside=TRUE,
                  col=col,
                  ylim=c(0,100),
                  xlab="Site",ylab="Frequency",
                  legend.text=row.names(lizards.matrix),
                  args.legend=list(x="top",bty="n"),
                  las=1,
                  cex.axis=1.2)
    
    # create matrix of errors
    lizards.error <- matrix(c(10, 5, 12, 10, 8, 6, 12, 28, 3), ncol = 3)
    
    # add vertical part of error bars
    segments(bp, lizards.matrix - lizards.error, bp, lizards.matrix + lizards.error)
    
    # horizontal parts of error bars
    ew <- (bp[2,1]-bp[1,1])/4
    segments(bp - ew, lizards.matrix - lizards.error, bp + ew, lizards.matrix - lizards.error)
    segments(bp - ew, lizards.matrix + lizards.error, bp + ew, lizards.matrix + lizards.error)
    


    The code works as follows:

    • I take advantage of the fact that barplot() returns a matrix containing the horizontal coordinates of the bar centres. I therefore store the output of barplot() in the variable bp for later use. Note also that I changed the ylim argument to c(0,100) to make sure that there is enough room in the plot for the error bars.

    • Then I define lizards.error, which contains the error for each of the bars in the plot. Its structure follows that of lizards.matrix, so lizards.error[1, 1] contains the error for the bar with height lizards.matrix[1, 1]. (The values used here are made up, just like the data.)

    • The error bars are then drawn using the function segments(). Like many plotting functions in base R, this function adds something to an existing plot. Its four relevant arguments are x0, y0, x1 and y1, which define line segments connecting the point pairs (x0, y0) and (x1, y1). If these arguments are vectors (or matrices, as here), each component defines a point pair, such that the line segments connect the points (x0[i], y0[i]) and (x1[i], y1[i]) for all i.

    • segments() is now used to draw each of the three segments that make up an error bar. The first call draws the vertical part. Its horizontal coordinate is the same as that of the bar, so bp can be used for both x0 and x1. The vertical coordinates are calculated from the height of the bar (lizards.matrix) and the size of the error (lizards.error).

    • The two horizontal lines of each error bar are produced similarly. Here you also need to define the width of the lines, which is calculated from the distance between neighbouring bars. The horizontal coordinates of the bars are stored in bp, so the distance between two adjacent bars within a group (which, with the default settings used here, equals the width of a bar) is the difference between two neighbouring coordinates: bp[2,1]-bp[1,1]. A quarter of that distance, ew, is drawn on either side of the bar centre. (bp is a matrix and [i, j] gets the matrix element in the i-th row and j-th column.)
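
    To see what the bullets above describe, it can help to look at bp, lizards.error and ew directly. The following is only a small sketch for inspection: it rebuilds the made-up data from the question with cbind() (a shortcut, not the code used above), and it relies on the plot = FALSE argument of barplot(), which computes the bar positions without drawing anything.

```r
# Same made-up data as in the question, built directly as a matrix
lizards.matrix <- cbind(San_Diego = c(65, 20, 74),
                        Rosarito  = c(34, 35, 23),
                        La_Paz    = c(21, 71, 28))
rownames(lizards.matrix) <- c("Ultradominant", "Dominant", "Sneaker")

# plot = FALSE only computes the horizontal bar positions, nothing is drawn
bp <- barplot(lizards.matrix, beside = TRUE, plot = FALSE)
print(dim(bp))  # same shape as lizards.matrix: one position per bar
print(bp)

# matrix() fills column by column, so the first three errors belong to
# San_Diego; the dimnames are added here only to make that visible
lizards.error <- matrix(c(10, 5, 12, 10, 8, 6, 12, 28, 3), ncol = 3,
                        dimnames = dimnames(lizards.matrix))
print(lizards.error)

# half-width of the horizontal caps, in x-axis units
ew <- (bp[2, 1] - bp[1, 1]) / 4
print(ew)
```

    Because bp, lizards.matrix and lizards.error all have the same shape, expressions such as lizards.matrix - lizards.error line up bar by bar, which is what allows a single segments() call to draw all nine lines at once.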

    EDIT: As rawr points out, a similar result can be obtained using one call of arrows() instead of three calls of segments():

    arrows(bp, lizards.matrix - lizards.error, bp, lizards.matrix + lizards.error,
           code = 3, angle = 90, length = 0.15)
    


    • The range covered by the vertical line is given in exactly the same way as for segments().
    • code = 3 tells the function to draw an arrow head at both ends of the line.
    • angle is the angle between the shaft of the arrow and the lines that form the arrow head. An angle of 90 degrees leads to a horizontal line.

    This solution is simpler, since it replaces three function calls with one. The only disadvantage that I see is that the width of the error bars (the length argument) is given in inches, so their width relative to the bars may change when the plot is rendered at a different size. In the case of segments(), the width of the error bars is given in terms of the horizontal coordinates and therefore scales with the plot.
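
    If you want to use arrows() but still tie the cap width to the bars, one possible workaround (my own sketch, not part of rawr's suggestion) is to convert the width from x-axis units to inches using par("pin") and par("usr"). The names in_per_x and cap_in are made up for this example. The conversion is only valid for the device size at the time it is computed, so it has to be redone if the plot is redrawn at another size.

```r
# Made-up data and errors, as above
lizards.matrix <- cbind(San_Diego = c(65, 20, 74),
                        Rosarito  = c(34, 35, 23),
                        La_Paz    = c(21, 71, 28))
lizards.error <- matrix(c(10, 5, 12, 10, 8, 6, 12, 28, 3), ncol = 3)

bp <- barplot(lizards.matrix, beside = TRUE, ylim = c(0, 100))
ew <- (bp[2, 1] - bp[1, 1]) / 4   # half-width of the caps in x-axis units

# par("pin")[1] is the width of the plot region in inches,
# par("usr")[1:2] are the x-axis limits in user coordinates
in_per_x <- par("pin")[1] / diff(par("usr")[1:2])
cap_in <- ew * in_per_x           # the same half-width, now in inches

# with angle = 90, length is the cap length on each side of the shaft
arrows(bp, lizards.matrix - lizards.error, bp, lizards.matrix + lizards.error,
       code = 3, angle = 90, length = cap_in)
```

    With this, the caps should come out about as wide as those drawn by the segments() version, as long as the plot is not resized afterwards.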