Search code examples
spss

Destring many, many variables, checking for non-numeric values


Let's say I've imported data from excel that has many, many variables, say v1 through v4000. Each of these is intended to be numeric, and most cases have numeric-only values, but there are some cases that have non-numeric characters. For some of those non-numerics, I know the meaning (e.g., "NA" for missing), and potentially some unknown strings that should be investigated.

For each variable, I think I would like to do something like 1) create a numeric version of that variable that has the original values for all cases that had numeric values, 2) create a list of unique string values for cases with non-numerics so those can be investigated. With 4,000 variables, I would ideally use some type of loop to do this.

How can that be done? Is it even possible?


Solution

  • I was able to solve this using the below macro, which creates a new variable with a "_str" suffix that holds the original values, and which can therefore be used to report frequencies of values that were turned into system missing values.

       DEFINE destringvars(names=!cmdend)
        !do !i !in (!names)
            RENAME VARIABLES (!i=!concat(!i,"_str")).
            STRING !i (A9).
            compute !i=!concat(!i,"_str").
            alter type !i(f8).
            TEMPORARY.
            SELECT if SYSMIS(!i).
            FREQUENCY !concat(!i,"_str").
        !DOEND        
        EXECUTE.
    !enddefine