
Selecting variables in a dataset according to values in another dataset


I have a dataset with around 100 variables and want to create a subset that KEEPs only those variables whose names appear as values of a variable in another dataset. Can someone please help me with the SPSS syntax?

This is what it should look like:

DATASET ACTIVATE basedataset.
SAVE OUTFILE ='Newdata.sav'
/KEEP Var1.

Var1 is the variable in the other dataset whose values are the names of the variables I want to keep. I am not sure whether a vector should be involved or whether there is an easier way to do this.


Solution

  • The following will create a macro containing the list of variables you require, to use in your analysis or in subsetting the data.

    First I'll create some sample data to demonstrate on:

    data list free /v1 to v10 (10f3).
    begin data
    1,2,3,2,4,7,77,777,66,55
    end data.
    dataset name basedataset.
    
    data list free/var1 (a4).
    begin data
    "v3", "v5", "v6", "v9"
    end data.
    dataset name varnames.
    

    Now to create the list:

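    * Write one VARIABLE ATTRIBUTE command per row of var1 to a syntax file.
    dataset activate varnames.
    write out="yourpath\var1 selection.sps" 
        /"VARIABLE ATTRIBUTE VARIABLES= ", var1, " ATTRIBUTE=selectVars('yes')." .
    exe.
    
    * Set selectVars to no for every variable, then run the generated file to set it to yes for the listed ones.
    dataset activate basedataset.
    VARIABLE ATTRIBUTE VARIABLES=all  ATTRIBUTE=selectVars('no').
    insert file="yourpath\var1 selection.sps".
    * Define a macro holding the names of all variables where selectVars is yes.
    SPSSINC SELECT VARIABLES MACRONAME="!varlist" /ATTRVALUES NAME=selectVars VALUE = yes .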
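
    To make the intermediate step concrete: with the sample data above, the WRITE command should produce one line per row of varnames, so the generated syntax file would look roughly like the sketch below. The exact spacing is an assumption (var1 is padded to its a4 width), and the comment line is only an annotation, not part of the generated file.

    ```spss
    * Illustration only: approximate contents of the generated syntax file for the sample data.
    VARIABLE ATTRIBUTE VARIABLES= v3   ATTRIBUTE=selectVars('yes').
    VARIABLE ATTRIBUTE VARIABLES= v5   ATTRIBUTE=selectVars('yes').
    VARIABLE ATTRIBUTE VARIABLES= v6   ATTRIBUTE=selectVars('yes').
    VARIABLE ATTRIBUTE VARIABLES= v9   ATTRIBUTE=selectVars('yes').
    ```

    Running this file through INSERT against basedataset overrides the 'no' attribute on just these four variables, which is what the SPSSINC SELECT VARIABLES step then picks up.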
    

    The list is now ready and can be called using the macro name !varlist in any command, for example:

    freq !varlist.
    

    or

    SAVE OUTFILE ='Newdata.sav' /KEEP !varlist.
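
    As a quick check on what the macro stands for: assuming the steps above ran without errors on the sample data, the macro should expand to the four names listed in varnames, so the two commands above would be equivalent to this hand-expanded sketch.

    ```spss
    * Assumed expansion for the sample data: v3 v5 v6 v9.
    freq v3 v5 v6 v9.
    SAVE OUTFILE='Newdata.sav' /KEEP v3 v5 v6 v9.
    ```

    With real data the list comes from the values of var1, so the macro call is the form to use; the expanded version is shown only to clarify what gets substituted.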