How to refactor string containing variable names into booleans?


I have an SPSS variable containing lines like:

|2|3|4|5|6|7|8|10|11|12|13|14|15|16|18|20|21|22|23|24|25|26|27|28|29|

Every line starts and ends with a pipe. I need to refactor it into boolean variables, as follows:

var       var1  var2  var3  var4  var5
|2|4|5|   0     1     0     1     1

I have tried to do it with a loop like:

loop # = 1 to 72.
compute var# = SUBSTR(var,2#,1).
end loop.
exe.

My code doesn't work with numbers of two or more digits, and it doesn't place the values into their respective variables. I also tried nesting char.substr(var,char.rindex(var,'|') + 1) in another loop, with no luck, because it still doesn't let me recognize the variable number.

How can I do it?


Solution

  • This looks like a nice job for the DO REPEAT command. However, the type conversion is somewhat tricky:

    * For each i from 1 to 72, set var<i> to 1 if "|i|" occurs in var, else 0.
    DO REPEAT var#i=var1 TO var72
             /i=1 TO 72.
    COMPUTE var#i = (CHAR.INDEX(var,CONCAT("|",LTRIM(STRING(i,F2.0)),"|"))>0).
    END REPEAT.
    EXECUTE.
    

    Explanation: Let's go from the inside to the outside:

    • STRING(value,F2.0) converts the numeric value into a two-character string (with a leading space when the number has just one digit), e.g. 2 -> " 2".
    • LTRIM() removes the leading spaces, e.g. " 2" -> "2".
    • CONCAT() concatenates strings. In the above code it adds the "|" before and after the number, e.g. "2" -> "|2|"
    • CHAR.INDEX(stringvar,searchstring) returns the position at which the searchstring was found. It returns 0 if the searchstring wasn't found.
    • CHAR.INDEX(stringvar,searchstring)>0 is a logical expression: it evaluates to 1 (true) if the searchstring was found and to 0 (false) otherwise. That 1 or 0 is what gets stored in the boolean variable.
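
To try the approach end to end, here is a minimal sketch on two made-up rows, limited to five target variables so the result is easy to read. The inline sample data, the A100 string width, and the stand-in name v are assumptions for illustration only; they are not part of the original data set.

```spss
* Hypothetical sample data: one string variable holding the pipe-delimited numbers.
DATA LIST FREE / var (A100).
BEGIN DATA
"|2|4|5|"
"|1|12|"
END DATA.

* Create var1 to var5 and set each to 1 if "|i|" occurs in var, else 0.
DO REPEAT v = var1 TO var5
         /i = 1 TO 5.
COMPUTE v = (CHAR.INDEX(var, CONCAT("|", LTRIM(STRING(i, F2.0)), "|")) > 0).
END REPEAT.
EXECUTE.

* Show the resulting variables.
LIST.
```

The first row should reproduce the pattern from the question (0 1 0 1 1). The second row is there to show why the pipe is added on both sides of the number: "|2|" does not occur in "|1|12|", so var2 should stay 0 even though the digit 2 appears inside 12.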