r, r-factor

Is there an elementary way to replace R's integer encoding of levels with labels?


This is my first question here, so I'm hoping that it's suitable for this forum. Any suggestions on how to improve the question or title would be very much appreciated.

Given

> experiment <- data.frame(old=factor(c("z","z","z","z","z"),levels=c("x","y","z")),
new=factor(c("y","z","x",NA,NA),levels=c("x","y","z")))
> experiment
  old  new
1   z    y
2   z    z
3   z    x
4   z <NA>
5   z <NA>

I would like to replace each value of old with the corresponding value of new whenever new is not NA. The command

> experiment$old <- ifelse(is.na(experiment$new),experiment$old,experiment$new)

seems to be what I want, except I am getting R's integer encoding of levels rather than the labels themselves:

> experiment
  old  new
1   2    y
2   3    z
3   1    x
4   3 <NA>
5   3 <NA>

Is there some elementary way to translate R's integer encoding of levels back into labels? I was hoping to get

> experiment
  old  new
1   y    y
2   z    z
3   x    x
4   z <NA>
5   z <NA>

as output instead.

Thank you very much.


Solution

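  • `ifelse` drops the factor attributes and returns the underlying integer codes. Because both columns share the same levels, those codes can be used as an index into `levels(experiment$old)`: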

    > experiment$old <- levels(experiment$old)[
                           ifelse(is.na(experiment$new),experiment$old,experiment$new)] 
    > experiment
      old  new
    1   y    y
    2   z    z
    3   x    x
    4   z <NA>
    5   z <NA>
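
One thing to be aware of: the indexing approach above leaves `old` as a character vector rather than a factor. If you want to keep `old` as a factor, a possible alternative is to skip `ifelse` and assign into the factor by position. This is only a sketch, and it assumes both columns have the same levels, as they do in the question. The name `has_new` is introduced here purely for illustration.

```r
# Rebuild the example data from the question.
experiment <- data.frame(
  old = factor(c("z", "z", "z", "z", "z"), levels = c("x", "y", "z")),
  new = factor(c("y", "z", "x", NA, NA), levels = c("x", "y", "z"))
)

# Rows where `new` has a value to copy over.
has_new <- !is.na(experiment$new)

# Subset assignment into a factor matches values by label,
# so `old` keeps its class and its levels.
experiment$old[has_new] <- experiment$new[has_new]

experiment
```

Because the assignment matches by label rather than by integer code, a label in `new` that is not among the levels of `old` would produce `NA` with a warning instead of a silently wrong value.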