My data is organized in the following way:
ID Year ID2012 var1 var2
1 2012 .
A 2014 1
A 2016 .
2 2012 .
B 2014 2
B 2016 .
For year 2014, I know the correspondence between ID2012 and ID2014, and I know that ID2016 is coded in the same way as ID2014. Now I want to make ID across the years consistent to get a panel: either A is replaced by 1 or the other way round.
ID Year ID2012 var1 var2
A 2012 .
A 2014 1
A 2016 .
B 2012 .
B 2014 2
B 2016 .
or
ID Year ID2012 var1 var2
1 2012 .
1 2014 1
1 2016 .
2 2012 .
2 2014 2
2 2016 .
There is no arithmetic rule between ID2012 and ID2014 that can be utilized.
Here's one way to do it:
# Read in the data
df <- read.table(text = "ID Year ID2012 var1
1 2012 . 9.1
A 2014 1 2.2
A 2016 . 1.0
2 2012 . 4.8
B 2014 2 4.7
B 2016 . 6.1",
header = T, stringsAsFactors = F)
# Find the positions where ID2012 tells us what ID to replace it with
ID.map <- match(df$ID, df$ID2012) # [1] 2 NA NA 5 NA NA
# Replace values in the ID column where they are defined in the map
df$ID[!is.na(ID.map)] <- df[ID.map[!is.na(ID.map)], 'ID']
# Result:
# ID Year ID2012 var1
# 1 A 2012 . 9.1
# 2 A 2014 1 2.2
# 3 A 2016 . 1.0
# 4 B 2012 . 4.8
# 5 B 2014 2 4.7
# 6 B 2016 . 6.1