I have a large number of files that share the same target variables but have a large number of input variables that vary from file to file. I would like to run classification and regression analyses on a new file without explicitly listing the input variables each time.
I am able to define a list of input variables within SPSS using spssinc select variables by matching a regular expression against the variable names. For most tasks I would then run a loop using a macro so that I do not need to list the variables explicitly. However, this is not appropriate for many classification and regression tasks, as I only want to run the analysis once for a single target variable and just need to define the list of input variables.
Below is an example dataset (much smaller than the datasets I am working with).
data list list/ID (A3) Sex (A1) Age (F2.0) Education (A5) Test_price01 Test_new01 Test_income01 Test_exp01 Test_01 Test_house01 Test_car01 Test_boat01 Test_var01 Test_var02 .
begin data
ID1 M 20 Prim 1 2 3 4 5 6 7 8 9 9
ID2 F 22 High 5 4 3 6 3 8 1 2 5 8
ID3 M 30 High 0 8 6 4 2 1 3 5 7 9
end data.
dataset name survey.
I would like to run a discriminant analysis, which I could do manually using the code below:
DATASET ACTIVATE survey.
DISCRIMINANT
/GROUPS=Age(20 30)
/VARIABLES=Test_price01 Test_new01 Test_income01 Test_exp01 Test_01 Test_house01 Test_car01
Test_boat01 Test_var01 Test_var02
/ANALYSIS ALL
/PRIORS EQUAL
/CLASSIFY=NONMISSING POOLED MEANSUB.
I have been able to define the input variables using spssinc select variables with the regular expression 'Test_':
spssinc select variables macroname="!Test_Vars" /properties pattern=".*Test_".
It would be great if I could somehow use this list (or another approach) to dynamically update my input variables for classification and regression tasks.
That is exactly what the macro name from spssinc select variables is for: you put it in the syntax in place of a list of variables.
So in your syntax it should look like this:
DISCRIMINANT
/GROUPS=Age(20 30)
/VARIABLES= !Test_Vars
/ANALYSIS ALL
/PRIORS EQUAL
/CLASSIFY=NONMISSING POOLED MEANSUB.
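Putting the pieces together, the sequence might look like the sketch below. It assumes the SPSSINC SELECT VARIABLES extension is installed and that the selection command runs before any procedure that references the macro, since the macro only exists once that command has executed. The REGRESSION call is purely illustrative and not part of the original question; it reuses Age as the dependent variable only because that is the target in the example data.

```spss
* Define the macro first so it exists before any procedure refers to it.
SPSSINC SELECT VARIABLES MACRONAME="!Test_Vars"
  /PROPERTIES PATTERN=".*Test_".

* Classification using the macro in place of the explicit variable list.
DISCRIMINANT
  /GROUPS=Age(20 30)
  /VARIABLES=!Test_Vars
  /ANALYSIS ALL
  /PRIORS EQUAL
  /CLASSIFY=NONMISSING POOLED MEANSUB.

* Illustrative regression with the same macro (hypothetical example).
REGRESSION
  /DEPENDENT Age
  /METHOD=ENTER !Test_Vars.
```

Because the macro is expanded when each procedure runs, rerunning the selection command on a new file should be enough to refresh the input list without editing the analysis syntax.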