How to rename identical values in a column within R?


Say I have a data set:

a <- c(101,101,102,102,103,103)
b <- c("M","M","P","P","M","M")
dt <- as.data.frame(cbind(a,b))
dt

    a b
1 101 M
2 101 M
3 102 P
4 102 P
5 103 M
6 103 M

Column a is subject_ID, and column b is subject_name. I want each subject to have a unique name: the name for subject ID 101 should become M1, and the name for 103 should become M2.

Is there a way to do this by indexing?

This does not work.

dt.try1 <- gsub("M","M1",dt[1:2,c(2)])
dt.try1
[1] "M1" "M1"

This is what would be ideal result:

    a  b
1 101  M
2 101  M
3 102  P
4 102  P
5 103 M2
6 103 M2

Why doesn't this work?


Solution

  • Sample data.

    a <- c(101,101,102,102,103,103)
    b <- c("M","M","P","P","M","M")
    dt <- data.frame(a, b)
    

    FYI, never use data.frame(cbind(..)) or as.data.frame(cbind(..)) to create a frame: cbind creates a matrix by default, and a matrix is limited to one class (unlike a frame), so if at least one of the vectors is character, every column becomes character. It's always better to use data.frame(..) directly.
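A quick way to see the coercion, as a minimal sketch (m, dt_bad and dt_good are names made up for this illustration):

```r
a <- c(101, 101, 102, 102, 103, 103)
b <- c("M", "M", "P", "P", "M", "M")

# cbind() builds a matrix, so the numeric IDs are coerced to character
m <- cbind(a, b)
typeof(m)               # "character"

# going through the matrix loses the numeric column ...
dt_bad <- as.data.frame(cbind(a, b))
# ... while data.frame() keeps each vector's own class
dt_good <- data.frame(a, b)

is.numeric(dt_bad$a)    # FALSE
is.numeric(dt_good$a)   # TRUE
```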

    Note: for clarity, your "ideal output" shows M,M,P,P,M2,M2, but your previous code block tries to change the first two to M1. I'm basing my code on the assumption that you need the first two to be M1 instead of just M. (If you do want the first two left as M, akrun's answer is correct, though this methodology could be adjusted.)
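If the output exactly as shown in the question is wanted (first M group left alone, later groups numbered from 2), one possible adjustment in base R is sketched here. The names key, idx and b_new are invented for this illustration and are not part of the original answer; stringsAsFactors = FALSE is spelled out so that b is a character column on R versions before 4.0 as well.

```r
dt <- data.frame(a = c(101, 101, 102, 102, 103, 103),
                 b = c("M", "M", "P", "P", "M", "M"),
                 stringsAsFactors = FALSE)

# one row per subject ID
key <- unique(dt[, c("a", "b")])

# position of each ID within its name group (1 for the first ID using a name)
idx <- ave(seq_along(key$b), key$b, FUN = seq_along)

# keep the name for the first ID, append the position for later ones
key$b_new <- ifelse(idx > 1, paste0(key$b, idx), key$b)

# map the new names back onto every row by subject ID
dt$b <- key$b_new[match(dt$a, key$a)]
dt
```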

    dplyr

    library(dplyr)
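    dt %>%
      # one row per subject ID
      distinct(a, b) %>%
      group_by(b) %>%
      # number the IDs within a name only when several IDs share that name
      mutate(b = if (n() > 1) paste0(b, row_number()) else b) %>%
      # map the new names back onto every original row, then drop the old name
      left_join(dt, ., by = "a", suffix = c(".x", "")) %>%
      select(-b.x)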
    #     a  b
    # 1 101 M1
    # 2 101 M1
    # 3 102  P
    # 4 102  P
    # 5 103 M2
    # 6 103 M2
    

    base R

    # one row per subject ID
    dt2 <- unique(dt[, c("a", "b")])
    # number the IDs within a name only when several IDs share that name
    dt2$b <- ave(dt2$b, dt2$b, FUN = function(z) {
      if (length(z) > 1) paste0(z, seq_along(z)) else z
    })
    dt2
    #     a  b
    # 1 101 M1
    # 3 102  P
    # 5 103 M2
    merge(subset(dt, select = -b), dt2, by = "a")
    #     a  b
    # 1 101 M1
    # 2 101 M1
    # 3 102  P
    # 4 102  P
    # 5 103 M2
    # 6 103 M2
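
On the original question of doing this by indexing: the gsub() attempt does compute the new names, but it stores them in a separate vector (dt.try1) and never writes them back into dt. A minimal sketch of indexed assignment follows, assuming b is a character column (stringsAsFactors = FALSE is spelled out so that this also holds on R versions before 4.0):

```r
dt <- data.frame(a = c(101, 101, 102, 102, 103, 103),
                 b = c("M", "M", "P", "P", "M", "M"),
                 stringsAsFactors = FALSE)

# select rows by subject ID and overwrite their names in place
dt$b[dt$a == 101] <- "M1"
dt$b[dt$a == 103] <- "M2"
dt
```

Dropping the first assignment leaves the first group as M, which matches the "ideal result" in the question. Hard-coding IDs like this is only practical for a handful of subjects; the grouped approaches above avoid it.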