Extract a list of variables satisfying certain conditions and store it in a new variable using SPSS syntax


I have around 300 variables and I am calculating their skewness and kurtosis. Now I want to create a new variable that holds the list of all variables whose skewness and kurtosis fall within a certain range. The idea is to select only the variables that satisfy the condition and to normalize all the others.

To calculate skewness I am using:

Descriptives A TO Z
/Statistics Skewness.
Execute.

I know this is not valid syntax, but I need something like this:

Compute x= if(Skewness(A TO Z)>1)

Please help me out with SPSS syntax for this.


Solution

  • There are multiple ways to approach this, so there might be an easier way.

    The syntax below will do it for you. You just need to change 'var1 TO varN' to your own list of variables and set whatever criteria you want for skewness and kurtosis on the two COMPUTE lines that create the flags.

    If I were doing this, I would go a step further and build the normalization into the syntax using WRITE OUT = ".sps" /CMD followed by INSERT FILE = ".sps", but that isn't what you asked for.

    DATASET DECLARE DistributionSyntax.
    OMS
      /SELECT TABLES
      /IF SUBTYPES=["Descriptives"] INSTANCES=[1]
      /DESTINATION FORMAT=SAV OUTFILE = 'DistributionSyntax'.
    EXAMINE VARIABLES=var1 TO varN
      /PLOT NONE
      /STATISTICS DESCRIPTIVES
      /CINTERVAL 95
      /MISSING PAIRWISE
      /NOTOTAL.
    OMSEND.
    DATASET ACTIVATE DistributionSyntax.
    
    USE ALL.
    FILTER OFF.
    SELECT IF ANY(Var2,'Skewness','Kurtosis').
    EXECUTE.
    STRING VarName (A64).
    COMPUTE SkewnessFlag = (Var2 = 'Skewness' AND ABS(Statistic) > 2).
    COMPUTE KurtosisFlag = (Var2 = 'Kurtosis' AND ABS(Statistic) > 2).
    COMPUTE VarName = CHAR.SUBSTR(Var1,1,CHAR.INDEX(Var1,' ')-1).
    EXECUTE.
    
    USE ALL.
    COMPUTE filter_$=(SkewnessFlag = 1).
    VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
    FORMATS filter_$ (f1.0).
    FILTER BY filter_$.
    EXECUTE.
    FRE VarName.
    
    USE ALL.
    COMPUTE filter_$=(KurtosisFlag= 1).
    VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
    FORMATS filter_$ (f1.0).
    FILTER BY filter_$.
    EXECUTE.
    FRE VarName.
    
    USE ALL. 
    FILTER OFF. 
    EXECUTE.
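
    If a single combined list is more convenient than two separate frequency tables, the two flags can be merged first. This is only a sketch: AnyFlag is a hypothetical helper name that is not part of the syntax above, and it assumes the DistributionSyntax dataset still holds the SkewnessFlag, KurtosisFlag and VarName columns created earlier.

    * Sketch only. AnyFlag is a hypothetical helper variable.
    DATASET ACTIVATE DistributionSyntax.
    COMPUTE AnyFlag = (SkewnessFlag = 1 OR KurtosisFlag = 1).
    * TEMPORARY limits the SELECT IF to the next procedure, so no rows are deleted.
    TEMPORARY.
    SELECT IF (AnyFlag = 1).
    FREQUENCIES VARIABLES=VarName.

    A variable flagged on both statistics shows up with a frequency of 2, because the dataset has one row per variable per statistic.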
    

    If you omit the select data blocks after you compute the flags and replace them with the syntax below, it will calculate normalized versions of the variables that meet your criteria. This creates new variables. You will want to add a file location for the syntax file (replace the "~\" in the WRITE and INSERT commands) and change the dataset name referenced as 'RAWDATA' to whatever your dataset name is:

    USE ALL.
    FILTER OFF.
    SELECT IF ANY(1,SkewnessFlag,KurtosisFlag).
    EXECUTE.
    
    STRING CMD (A250).
    COMPUTE CMD = CONCAT("COMPUTE ",RTRIM(VarName),".Norm = ln(",RTRIM(VarName),").").
    EXECUTE.
    
    DATA LIST /CMD 1-250 (A).
    BEGIN DATA
    EXECUTE.
    END DATA.
    DATASET NAME EXE WINDOW = FRONT.
    
    DATASET ACTIVATE DistributionSyntax.
    ADD FILES /FILE = *
    /FILE = 'EXE'.
    EXECUTE.
    DATASET CLOSE EXE.
    DATASET ACTIVATE DistributionSyntax.
    
    WRITE OUT="~\Normalize Variables.sps" /CMD.
    EXECUTE.
    DATASET CLOSE DistributionSyntax.
    DATASET ACTIVATE RAWDATA.
    INSERT FILE="~\Normalize Variables.sps".
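
    To make the generated file concrete: assuming two hypothetical flagged variables named var3 and var7 (illustrative names only, not taken from any real dataset), Normalize Variables.sps would contain roughly the following, with one COMPUTE line per flagged row and the appended EXECUTE at the end.

    * Illustrative content of the generated file, using hypothetical names var3 and var7.
    COMPUTE var3.Norm = ln(var3).
    COMPUTE var7.Norm = ln(var7).
    EXECUTE.

    A variable flagged on both skewness and kurtosis gets the same COMPUTE line twice, which is harmless. Note that LN returns system-missing for zero or negative values, so variables with such values may need a different transformation or a shift before taking the log.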