Say I have a data set:
a <- c(101,101,102,102,103,103)
b <- c("M","M","P","P","M","M")
dt <- as.data.frame(cbind(a,b))
dt
    a b
1 101 M
2 101 M
3 102 P
4 102 P
5 103 M
6 103 M
Column a is subject_ID, and column b is subject_name. I want to uniquely rename subject ID 101 to M1, and 103 to M2.
Is there a way to do this by indexing?
This does not work.
dt.try1 <- gsub("M","M1",dt[1:2,c(2)])
dt.try1
[1] "M1" "M1"
This would be the ideal result:
    a b
1 101 M
2 101 M
3 102 P
4 102 P
5 103 M2
6 103 M2
Why doesn't this work?
Sample data.
a <- c(101,101,102,102,103,103)
b <- c("M","M","P","P","M","M")
dt <- data.frame(a, b)
FYI, never use data.frame(cbind(..)) to create a frame: in this case, since at least one of the vectors is character, they will all be character, since cbind by default creates matrices (which are limited to one class, unlike frames). It's always better here to use data.frame(..) directly.
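As a quick illustration of that coercion, here is a minimal sketch that builds the frame both ways and compares the class of column a. The names via_cbind and direct are only for this example and do not appear in the question.

```r
a <- c(101, 101, 102, 102, 103, 103)
b <- c("M", "M", "P", "P", "M", "M")

# cbind() builds a matrix first, so the numeric IDs are coerced to character.
via_cbind <- as.data.frame(cbind(a, b))

# data.frame() keeps each vector's own class.
direct <- data.frame(a, b)

class(via_cbind$a)  # "character" (a factor in R versions before 4.0.0)
class(direct$a)     # "numeric"
```

With the cbind route, a comparison such as a > 101 would no longer be numeric, which is the practical reason to build the frame directly.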
Note: for clarity, your "ideal output" shows M,M,P,P,M2,M2, but your previous code block tries to change the first two to M1. I'm basing my code on the assumption that you need the first two to be M1 instead of just M. (If you do want them left as M, akrun's answer is correct, though this methodology could be adjusted.)
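library(dplyr)
# Build a one-row-per-ID lookup, number the names that occur under more than
# one ID, then join the new names back onto the original frame.
dt %>%
  distinct(a, b) %>%
  group_by(b) %>%
  mutate(b = if (n() > 1) paste0(b, row_number()) else b) %>%
  left_join(dt, ., by = "a", suffix = c(".x", "")) %>%
  select(-b.x)
#     a  b
# 1 101 M1
# 2 101 M1
# 3 102  P
# 4 102  P
# 5 103 M2
# 6 103 M2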
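# Base R version of the same idea: one row per ID, then number repeated names.
dt2 <- unique(dt[, c("a", "b")])
dt2$b <- ave(dt2$b, dt2$b, FUN = function(z) if (length(z) > 1) paste0(z, seq_along(z)) else z)
dt2
#     a  b
# 1 101 M1
# 3 102  P
# 5 103 M2
# Swap the old b column for the renamed one by merging on the ID.
merge(subset(dt, select = -b), dt2, by = "a")
#     a  b
# 1 101 M1
# 2 101 M1
# 3 102  P
# 4 102  P
# 5 103 M2
# 6 103 M2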
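As for the literal question of doing this by indexing: the gsub() attempt did produce "M1" "M1", but the result was stored in a separate vector (dt.try1), so dt itself never changed. A minimal sketch of assigning back into the frame with logical indexing, assuming the same reading as above (101 becomes M1, 103 becomes M2) and only a handful of IDs to handle by hand:

```r
a <- c(101, 101, 102, 102, 103, 103)
b <- c("M", "M", "P", "P", "M", "M")
# stringsAsFactors = FALSE keeps b as character on R versions before 4.0.0 too.
dt <- data.frame(a, b, stringsAsFactors = FALSE)

# Select rows by subject ID and assign the new name in place.
dt$b[dt$a == 101] <- "M1"
dt$b[dt$a == 103] <- "M2"
dt
```

This is fine for two or three IDs; with many subjects sharing a name, the lookup-and-join approaches above avoid writing one line per ID.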