I am using mark-recapture data, with each individual having a PIT tag (Tag
) and a genetic sample ID given at the first observation (ID
), which is typically used as the ID for the individual (Unified.ID
). However, sometimes an animal will lose a PIT tag, and we have to retag it, but if we know its old tag number we record that as Alt_tag
.
This sample data is four observations from the same individual, so I am trying to get them all to have the same Unified.ID
by matching the Alt_tag
in row 4 to the Tag
in rows 1:3, so that all 4 rows end up with "CAL163037" in the Unified.ID
column.
sample.dat<-tibble(ID=c("CAL163037",NA,NA,NA),
Tag = c("132800","132800","132800","981930"),
Date = c("2016-08-23","2017-06-09","2017-06-22","2017-08-23"),
Alt_tag = c(NA,NA,NA,"132800"),
Unified.ID = c("CAL163037","CAL163037","CAL163037","981930"))
# A tibble: 4 × 5
ID Tag Date Alt_tag Unified.ID
<chr> <chr> <chr> <chr> <chr>
1 CAL163037 132800 2016-08-23 NA CAL163037
2 NA 132800 2017-06-09 NA CAL163037
3 NA 132800 2017-06-22 NA CAL163037
4 NA 981930 2017-08-23 132800 981930
I think I sort of have something close with the code below, but I feel like there is a more elegant way to do this, preferably with some kind of dplyr mutate, so it is easier to check that it is doing what I want it to (I can't get this if_else function working inside of mutate()). My full dataset is pretty big, so I want to be confident that it is working properly across the whole thing. Any ideas of how to clean this up?
> if_else(is.na(sample.dat$Alt_tag)==FALSE,(sample.dat[
which(sample.dat$Tag%in%sample.dat$Alt_tag),"Unified.ID"]%>%distinct()),NA)
# A tibble: 4 × 1
# Groups: Unified.ID [1]
Unified.ID
<fct>
1 CAL163037
2 CAL163037
3 CAL163037
4 CAL163037
Probably you can try this?
sample.dat %>%
mutate(Unified.Tag = coalesce(Alt_tag, Tag)) %>%
mutate(Unified.ID = unique(na.omit(ID)), .by = Unified.Tag) %>%
select(-Unified.Tag)
which gives
# A tibble: 4 × 5
ID Tag Date Alt_tag Unified.ID
<chr> <chr> <chr> <chr> <chr>
1 CAL163037 132800 2016-08-23 NA CAL163037
2 NA 132800 2017-06-09 NA CAL163037
3 NA 132800 2017-06-22 NA CAL163037
4 NA 981930 2017-08-23 132800 CAL163037