usually I use [R] for my data analysis, but these days I have to use SPSS. I was expecting that data manipulation might get a little bit more difficult this way, but after my first day I kind of surrender :D and I really would appreciate some help ... My problem is the following:
Do you know if there is (kind of) a simple solution to my problem?
Example:
dataset A:
ID | VAL1 |
---|---|
1 | A |
1 | B |
2 | D |
3 | K |
4 | A |
dataset B:
ID | VAL2 |
---|---|
1 | g |
2 | k |
4 | a |
5 | c |
5 | d |
5 | a |
2 | x |
expected result (best solution):
ID | VAL1 | VAL2 |
---|---|---|
1 | A | g |
1 | B | g |
2 | D | k |
3 | K | NA |
4 | A | a |
2 | D | x |
expected result (second best solution):
ID | VAL1 | VAL2 |
---|---|---|
1 | A | g |
2 | D | k |
3 | K | NA |
4 | A | a |
5 | NA | c |
5 | NA | d |
5 | NA | a |
2 | D | x |
what I get (worst solution):
ID | VAL1 | VAL2 |
---|---|---|
1 | A | g |
2 | D | k |
4 | A | a |
5 | NA | c |
5 | NA | d |
5 | NA | a |
2 | D | x |
From your example It looks like what you need is a full many to many join, based on the ID's existing in dataset A. You can get this by creating a full Cartesian-Product of the two dataset, using dataset A as the first\left dataset.
The following syntax assumes you have the STATS CARTPROD extention command installed. If you don't you can see here about installing it.
First I'll recreate your example to demonstrate on:
dataset close all.
data list list/id1 vl1 (2F3) .
begin data
1 232
1 433
2 456
3 246
4 468
end data.
dataset name aaa.
data list list/id2 vl2 (2F3) .
begin data
1 111
2 222
4 333
5 444
5 555
5 666
2 777
3 888
end data.
dataset name bbb.
Now the actual work is fairly simple:
DATASET ACTIVATE aaa.
STATS CARTPROD VAR1=id1 vl1 INPUT2=bbb VAR2=id2 vl2
/SAVE OUTFILE="C:\somepath\yourcartesianproduct.sav".
* The new dataset now contains all possible combinations of rows in the two datasets.
* we will select only the relevant combinations, where the two ID's match.
select if id1=id2.
exe.