python, macros, spss

Selecting variables in a file according to list held in another file


I have an SPSS working file with 3000+ variables and only need to keep 500. I also have an Excel file with the names of the variables to keep. Is there a way to read in this variable list and then delete whatever is not in it? I want to avoid doing this manually because I may have to do it again in the future.


Solution

  • If your variable list will not change (you will need it again, but with the same variable names), create a new syntax file, copy your variable list from Excel into the following code, save the syntax, and run it whenever you need it (while your data is open):

    add files /file=* /keep= var1 var2 var3 ...... (your entire variable list) .
    

    On the other hand, if the list will change from time to time, you can automate the creation of the above syntax.
    Run the following while your original data is open:

    dataset name orig. /* use this if your original data is open.
    GET DATA /TYPE=XLSX /FILE='path\your file with the variable names.xlsx' /SHEET=name 'whatever'.
    compute placeholder=1.
    flip /newnames = YourColumnName .
    delete variables CASE_LBL.
    spssinc select variables macroname="!myvarlist" .
    

    Adjust the GET DATA command so it points at your file with the variable names and the right sheet, and replace YourColumnName with the actual name of the column holding the variable names.

    At this point you have the full variable list in a macro named !myvarlist, which you can use anywhere in your syntax, for example to keep only those variables:

    dataset activate orig. /* or just open the original data.
    add files /file=* /keep= !myvarlist .
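
Since the question is also tagged python, here is a minimal sketch of generating the same `add files ... /keep=` command from a list of names in Python. The function name `build_keep_syntax` and its `width` parameter are hypothetical, made up for this illustration. It assumes you have already read the names from the Excel column into a plain Python list; how you read the file is left out.

```python
import textwrap


def build_keep_syntax(names, width=100):
    """Return an ADD FILES /KEEP command for the given variable names.

    `names` is any iterable of variable names (assumed to come from the
    Excel column). Blank entries and repeated names are dropped.
    """
    seen = set()
    cleaned = []
    for name in names:
        name = str(name).strip()
        # SPSS variable names are case-insensitive, so compare in lower case
        if name and name.lower() not in seen:
            seen.add(name.lower())
            cleaned.append(name)
    if not cleaned:
        raise ValueError("no variable names supplied")

    # Wrap the list over several indented lines so no syntax line gets too long
    body = textwrap.fill(
        " ".join(cleaned),
        width=width,
        initial_indent="  ",
        subsequent_indent="  ",
        break_long_words=False,
        break_on_hyphens=False,
    )
    return "add files /file=* /keep=\n" + body + " ."


# Example with placeholder names
print(build_keep_syntax(["var1", "var2", "var3"]))
```

You can paste the resulting text into a syntax window. If the SPSS Python integration is installed, passing the string to `spss.Submit` from inside a `begin program` block should run it directly against the active dataset.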