Search code examples
pythonsqloracle-databaseloopsspss-modeler

SPSS Modeler using python for doing loop through months from SQL code


I have a Node in SPSS Modeler with SQL code provided below. It's selecting a month and calculating a count for one month. I created a parameter '$P-p_ly_parameter' and assigned a value 201807 to it.

What I want to do is to run a loop through the months from 201807 to 201907.

I use a Python code putting it into Tools, Stream Properties, Execution.

But when I run it won't let me to get the results that I expect. In fact I'm not getting any results.

Obviously, I'm missing something. I suppose the result of the loop is not assigning to the month_id for each month.

Could you please help me with the way to do the loop in the right way? Should I use Select node and include something like this?

-- SQL
SELECT 
cust.month_id, 
count(*) as AB_P1_TOTAL

FROM tab1 cust
JOIN tab2 dcust ON dcust.month_id=cust.month_id and 
dcust.cust_srcid=cust.cust_srcid
WHERE   
cust.month_id ='$P-p_ly_parameter'
group by cust.month_id
order by cust.month_id

# Python

import modeler.api

# boilerplate definitions
stream = modeler.script.stream()
taskrunner = modeler.script.session().getTaskRunner()

# variables for starting year
startYear = 2018
# gets us to 2019
yearsToLoop = 1

# get the required node by Id
# double click on node, go to annotations and get ID from bottom right 
selectNode = stream.findByID('id5NBVZYS3XT2')
runNode = stream.findByID('id3N3V6JXBQU2')

# loop through our years
for year in range(0, yearsToLoop):
    # loop through months
    for month in range(1,13):
        #month_id = str(startYear + year) + str(month).rjust(2,'0')#ar
        p_ly_parameter = str(startYear + year) + str(month).rjust(2,'0')#ar
        #debug
        #print month_id
        print p_ly_parameter
        # set the condition in the select node
        #selectNode.setPropertyValue('condition', 'month_id = ' + month_id)
        #selectNode.setPropertyValue("condition", "'month_id = '$P-p_ly_parameter'")
        #selectNode.setPropertyValue('mode', 'Include')

        # run the stream
        runNode.run(None)

I expect the results by month for example 201807 500, 201808 1000 etc. But now I'm getting nothing


Solution

  • The missing piece is to set the value of the stream parameter. The line of code that says:

    p_ly_parameter = str(startYear + year) + str(month).rjust(2,'0')
    

    only sets the value of a variable in the Python script itself, but does not change the value of the stream parameter with the same name.

    You need to add a line immediately following that, which explicitly sets the value of the stream parameter such as:

    stream.setParameterValue("p_ly_parameter", p_ly_parameter)