I need to run two loops through my regression: one over the dependent variable and the other over a suffix for the prediction I need to save with each round. Either loop works fine on its own, but not when I combine them in the same regression. I think this has something to do with the mapping at the end of my regression, after the %. I get the error "TypeError: list indices must be integers, not str", presumably because my dependent variables are read as strings to get the values from the SPSS data frame. Is there any way to use a for loop in a regression that includes string variables?
I have tried using the map() function, but I got an error saying that iteration is not supported.
begin program.
import spss,spssaux
dependent = ['dv1', 'dv2', 'dv3', 'dv4', 'dv5']
spssSyntax = ''
depList = spssaux.VariableDict(caseless = True).expand(dependent)
varSuffix = [1,2,3,4,5]
for dep in depList:
    for var in varSuffix:
        spssSyntax += '''
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT %(dep)s
  /METHOD=FORWARD iv1 iv2 iv3
  /SAVE PRED(PRE_%(var)d).
'''%(depList[dep],varSuffix[var])
end program.
I get the error code 'TypeError: list indices must be integers, not str' with the code above. How do I map the loop while also including a string?
In Python, when you loop directly over an iterable, the loop variable takes the current value, so there is no need to index the original lists with depList[dep] and varSuffix[var]. Use the loop variables dep and var directly.
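The difference can be reproduced without SPSS. The following is a minimal sketch that uses a plain list as a stand-in for the expanded variable list; the names dep_list, seen and index_error are illustrative only and not part of the spss or spssaux modules:

```python
# Stand-in for the expanded variable list; no SPSS modules are needed here.
dep_list = ['dv1', 'dv2', 'dv3']

# Iterating directly yields the values themselves, not positions.
seen = [dep for dep in dep_list]

# Using one of those values as an index raises the TypeError from the question.
try:
    dep_list['dv1']
    index_error = None
except TypeError as exc:
    index_error = exc

print(seen)
print(type(index_error).__name__)
```

The loop variable already holds 'dv1', 'dv2' and so on, so it can go straight into the template.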
Additionally, consider str.format for string interpolation, which is the preferred method in Python 3 over the older, de-emphasized (but not deprecated) string modulo operator %:
for dep in depList:
    for var in varSuffix:
        spssSyntax += '''REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT {0}
  /METHOD=FORWARD iv1 iv2 iv3
  /SAVE PRED(PRE_{1}).
'''.format(dep, var)
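As a side note on the template in the question: named placeholders such as %(dep)s expect a mapping on the right-hand side of %, so passing a tuple there would fail even once the indexing is fixed. Here is a small sketch of the equivalent forms, using a shortened one-line template invented for illustration rather than the full REGRESSION command:

```python
dep, var = 'dv1', 1

# The % operator with named placeholders needs a dict, not a tuple.
with_percent = '/DEPENDENT %(dep)s /SAVE PRED(PRE_%(var)d).' % {'dep': dep, 'var': var}

# str.format with positional fields {0} and {1}.
with_format = '/DEPENDENT {0} /SAVE PRED(PRE_{1}).'.format(dep, var)

# str.format also accepts named fields, which can read more clearly.
with_named = '/DEPENDENT {dep} /SAVE PRED(PRE_{var}).'.format(dep=dep, var=var)

print(with_percent)
print(with_format)
print(with_named)
```

All three lines print the same text, so the choice is mostly about readability.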
Alternatively, consider combining the two lists into one loop using itertools.product, then use a list comprehension to build the string with join instead of concatenating on each loop iteration with +=:
from itertools import product
import spss,spssaux
dependent = ['dv1', 'dv2', 'dv3', 'dv4', 'dv5']
depList = spssaux.VariableDict(caseless = True).expand(dependent)
varSuffix = [1,2,3,4,5]
base_string = '''REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT {0}
  /METHOD=FORWARD iv1 iv2 iv3
  /SAVE PRED(PRE_{1}).
'''
# LIST COMPREHENSION UNPACKING TUPLES TO FORMAT BASE STRING
# JOIN RESULTING LIST WITH LINE BREAKS SEPARATING ITEMS
spssSyntax = "\n".join([base_string.format(*dep_var)
                        for dep_var in product(depList, varSuffix)])
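To see what product generates, here is a sketch with plain lists and a shortened template standing in for base_string (the names dep_list, var_suffix and template are illustrative, and no SPSS modules are required). product pairs every dependent variable with every suffix, so the five variables and five suffixes in the question would give 25 commands, with each suffix reused for every dependent variable:

```python
from itertools import product

dep_list = ['dv1', 'dv2']
var_suffix = [1, 2]
template = 'DEPENDENT {0} SAVE PRE_{1}'  # shortened stand-in for base_string

# Cartesian product: every element of dep_list paired with every suffix.
pairs = list(product(dep_list, var_suffix))

# Unpack each (dep, suffix) tuple into the template's positional fields.
commands = [template.format(*pair) for pair in pairs]

print(pairs)
print("\n".join(commands))
```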
If you instead need to iterate in parallel, element by element, over the two equal-length lists, consider zip instead of product:
spssSyntax = "\n".join([base_string.format(d,v)
                        for d,v in zip(depList, varSuffix)])
Or use enumerate to generate the index number:
spssSyntax = "\n".join([base_string.format(d,i+1)
                        for i,d in enumerate(depList)])
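For comparison, here is a sketch of the two parallel approaches side by side, again with plain lists and a shortened stand-in template (illustrative names, no SPSS modules). It also uses the start argument of enumerate, which is one way to avoid writing i+1 by hand:

```python
dep_list = ['dv1', 'dv2', 'dv3']
var_suffix = [1, 2, 3]
template = 'DEPENDENT {0} SAVE PRE_{1}'  # shortened stand-in for base_string

# zip pairs the lists position by position: one command per dependent variable.
zipped = [template.format(d, v) for d, v in zip(dep_list, var_suffix)]

# enumerate with start=1 produces the same 1-based suffixes without a second list.
numbered = [template.format(d, i) for i, d in enumerate(dep_list, start=1)]

print("\n".join(zipped))
print(zipped == numbered)
```

Both give one command per dependent variable, each with its own suffix. zip is the more general choice when the suffixes are not simply 1, 2, 3 and so on.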