I have following two data sets.
The first one from children looks like this.
ID year Q1 Q2 Q3 Q4 ....
101 2014 1 2 2 2
101 2016 1 2 2 2
101 2017 1 2 2 2
101 2018 1 2 2 2
401 2014 1 2 2 2
401 2015 1 2 3 3
401 2016 1 2 2 2
401 2017 1 2 1 1
401 2018 1 2 2 2
402 2014 1 1 0 3
402 2015 1 1 2 2
402 2016 1 1 2 2
402 2017 1 1 3 3
402 2018 1 1 2 3
Here's the second one from their parents.
ID year Q101 Q102
1 2014 1 3
1 2015 1 3
1 2016 1 3
1 2017 1 2
1 2018 1 2
2 2014 2 .
2 2015 1 2
2 2016 . .
2 2017 1 3
2 2018 2 .
4 2014 1 3
4 2015 1 3
4 2016 1 3
4 2017 1 3
4 2018 1 3
So the parent data ID can be matched to the children data ID deleted last two digits. It seems that parent ID 4 has two children.
I tried merge 1:m ID using kids data as the master data set. but it didn't work.
Thank you.
Getting good answers is made more likely by (a) attempting code and showing what you tried and (b) giving data in the form of code anybody using Stata can run. The code here follows from editing your post and is close to what you could get directly by using dataex
as explained in the Stata tag wiki or indeed at help dataex
in an up-to-date Stata or one in which you installed dataex
from SSC.
clear
input ID year Q1 Q2 Q3 Q4
101 2014 1 2 2 2
101 2016 1 2 2 2
101 2017 1 2 2 2
101 2018 1 2 2 2
401 2014 1 2 2 2
401 2015 1 2 3 3
401 2016 1 2 2 2
401 2017 1 2 1 1
401 2018 1 2 2 2
402 2014 1 1 0 3
402 2015 1 1 2 2
402 2016 1 1 2 2
402 2017 1 1 3 3
402 2018 1 1 2 3
end
gen IDP = floor(ID/100)
save children
clear
input ID year Q101 Q102
1 2014 1 3
1 2015 1 3
1 2016 1 3
1 2017 1 2
1 2018 1 2
2 2014 2 .
2 2015 1 2
2 2016 . .
2 2017 1 3
2 2018 2 .
4 2014 1 3
4 2015 1 3
4 2016 1 3
4 2017 1 3
4 2018 1 3
end
rename ID IDP
merge 1:m IDP year using children
list
+----------------------------------------------------------------------+
| IDP year Q101 Q102 ID Q1 Q2 Q3 Q4 _merge |
|----------------------------------------------------------------------|
1. | 1 2014 1 3 101 1 2 2 2 matched (3) |
2. | 1 2015 1 3 . . . . . master only (1) |
3. | 1 2016 1 3 101 1 2 2 2 matched (3) |
4. | 1 2017 1 2 101 1 2 2 2 matched (3) |
5. | 1 2018 1 2 101 1 2 2 2 matched (3) |
|----------------------------------------------------------------------|
6. | 2 2014 2 . . . . . . master only (1) |
7. | 2 2015 1 2 . . . . . master only (1) |
8. | 2 2016 . . . . . . . master only (1) |
9. | 2 2017 1 3 . . . . . master only (1) |
10. | 2 2018 2 . . . . . . master only (1) |
|----------------------------------------------------------------------|
11. | 4 2014 1 3 401 1 2 2 2 matched (3) |
12. | 4 2015 1 3 401 1 2 3 3 matched (3) |
13. | 4 2016 1 3 402 1 1 2 2 matched (3) |
14. | 4 2017 1 3 401 1 2 1 1 matched (3) |
15. | 4 2018 1 3 402 1 1 2 3 matched (3) |
|----------------------------------------------------------------------|
16. | 4 2014 1 3 402 1 1 0 3 matched (3) |
17. | 4 2015 1 3 402 1 1 2 2 matched (3) |
18. | 4 2016 1 3 401 1 2 2 2 matched (3) |
19. | 4 2017 1 3 402 1 1 3 3 matched (3) |
20. | 4 2018 1 3 401 1 2 2 2 matched (3) |
+----------------------------------------------------------------------+
As far as the merge
is concerned the essentials are identifiers with the same name(s) in both datasets and the correct pattern for merging. The parent identifier is only implied by the children dataset.