I have a data.frame that looks like this:
columnA=c(1,2,3,1.1,2.2,3.3,1,2)
columnB=c("a","b","c","d","e","f","g","h")
data=data.frame(columnA, columnB)
  columnA columnB
1     1.0       a
2     2.0       b
3     3.0       c
4     1.1       d
5     2.2       e
6     3.3       f
7     1.0       g
8     2.0       h
I would like to find the duplicated values in columnA and replace them with the element from the same row of columnB, storing the result in a new columnC. The desired output is:
  columnA columnB columnC
1     1.0       a     1.0
2     2.0       b     2.0
3     3.0       c     3.0
4     1.1       d     1.1
5     2.2       e     2.2
6     3.3       f     3.3
7     1.0       g       g
8     2.0       h       h
where the duplicates 1.0 and 2.0 in rows 7 and 8 of columnA have been replaced with the corresponding elements in rows 7 and 8 of columnB (g and h).
Any help would be highly appreciated. I have been struggling with this for a long time.
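Here is another option. Group by columnA, then keep the columnA value for the first occurrence within each group and use columnB for any later occurrence. columnA is converted with as.character() because columnC has to hold both numbers and letters.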
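library(tidyverse)

# Same sample data as in the question
data <- tibble(columnA = c(1, 2, 3, 1.1, 2.2, 3.3, 1, 2),
               columnB = c("a", "b", "c", "d", "e", "f", "g", "h"))

data %>%
  group_by(columnA) %>%
  # row_number() == 1 marks the first occurrence of each columnA value
  mutate(columnC = ifelse(row_number() == 1, as.character(columnA), columnB))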
#> # A tibble: 8 x 3
#> # Groups:   columnA [6]
#>   columnA columnB columnC
#>     <dbl> <chr>   <chr>
#> 1     1   a       1
#> 2     2   b       2
#> 3     3   c       3
#> 4     1.1 d       1.1
#> 5     2.2 e       2.2
#> 6     3.3 f       3.3
#> 7     1   g       g
#> 8     2   h       h
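If you would rather not load the tidyverse, the same idea can be sketched in base R with duplicated(), which is FALSE for the first occurrence of a value and TRUE for every repeat. This is only a minimal sketch on the sample data from the question; the helper name is_dup is my own, and stringsAsFactors = FALSE is set on the assumption that you may be running an R version older than 4.0, where columnB would otherwise become a factor.

```r
# Same sample data as in the question
data <- data.frame(
  columnA = c(1, 2, 3, 1.1, 2.2, 3.3, 1, 2),
  columnB = c("a", "b", "c", "d", "e", "f", "g", "h"),
  stringsAsFactors = FALSE  # assumption: guards against factor columns in R < 4.0
)

# is_dup (hypothetical name): FALSE for first occurrences, TRUE for repeats
is_dup <- duplicated(data$columnA)

# columnC must be character because it mixes numbers and letters
data$columnC <- ifelse(is_dup, data$columnB, as.character(data$columnA))

print(data)
```

Note that as.character() turns 1 into "1", not "1.0". If the trailing zero from the desired output matters, sprintf("%.1f", data$columnA) could be used in its place.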