
What's the biggest R-gotcha you've run across?


Is there a certain R-gotcha that had you really surprised one day? I think we'd all gain from sharing these.

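Here's mine: in list indexing, my.list[[1]] is not the same as my.list[1]. The first returns the element itself; the second returns a list of length one containing it. I learned this in my early days with R.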
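A quick sketch of the difference, in case it helps. The contents of my.list here are made up purely for illustration:

```r
# Hypothetical list, just to show the two kinds of indexing.
my.list <- list(a = 1:3, b = "text")

single <- my.list[1]     # single bracket: a sub-list of length one
double <- my.list[[1]]   # double bracket: the element itself

class(single)   # "list"
class(double)   # "integer"
```

So sum(my.list[[1]]) works, while sum(my.list[1]) fails because it is handed a list rather than a numeric vector.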


Solution

  • Removing rows from a data frame and then adding a row by index can leave you with duplicate row names, which then causes an error:

    > a<-data.frame(c(1,2,3,4),c(4,3,2,1))
    > a<-a[-3,]
    > a
      c.1..2..3..4. c.4..3..2..1.
    1             1             4
    2             2             3
    4             4             1
    > a[4,1]<-1
    > a
    Error in data.frame(c.1..2..3..4. = c("1", "2", "4", "1"), c.4..3..2..1. = c(" 4",  : 
      duplicate row.names: 4
    

    So what is going on here is:

    1. A four-row data.frame is created, so the row names are c(1,2,3,4).

    2. The third row is deleted, so the row names are c(1,2,4).

    3. A fourth row is added, and R automatically sets its row name to the index, i.e. 4, so the row names are c(1,2,4,4). This is illegal because row names must be unique. I don't see why R allows this; it seems to me that R should generate a unique row name instead.
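One possible way around the three steps above, assuming you don't need to keep the original row labels, is to reset the row names after deleting rows so that labels and positions line up again. This is only a sketch of a workaround, not something from the original answer, and how the unreset case behaves may depend on your R version:

```r
# Rebuild the data frame from the example and drop the third row.
a <- data.frame(c(1, 2, 3, 4), c(4, 3, 2, 1))
a <- a[-3, ]
before <- rownames(a)   # "1" "2" "4": the gap left by the deleted row

# Reset to automatic row names so labels match positions again.
rownames(a) <- NULL
after <- rownames(a)    # "1" "2" "3"

# Adding a row by index now uses the unused label "4".
a[4, 1] <- 1
rownames(a)             # "1" "2" "3" "4"
```

The second column of the new row is filled with NA, since only the first column was assigned.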