Can I apply a PMML model that includes DefineFunction using Augustus (Python)?


I'm using Augustus as a PMML model consumer. I've modified the add two numbers example to include a DefineFunction element, like this:

<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
    <Header/>
    <DataDictionary>
        <DataField name="x" dataType="double" optype="continuous"/>
        <DataField name="y" dataType="double" optype="continuous"/>
    </DataDictionary>
    <TransformationDictionary>
        <DefineFunction dataType="float" optype="continuous" name="add">
            <ParameterField optype="continuous" name="first"></ParameterField>
            <ParameterField optype="continuous" name="second"></ParameterField>
            <Apply function="+" invalidValueTreatment="returnInvalid">
                <FieldRef field="first"></FieldRef>
                <FieldRef field="second"></FieldRef>
            </Apply>
        </DefineFunction>
        <DerivedField name="z" dataType="double" optype="continuous">
            <Apply function="add">
                <FieldRef field="x"/>
                <FieldRef field="y"/>
            </Apply>
        </DerivedField>
    </TransformationDictionary>
</PMML>

I save this model in a file and try to run it like so:

from resources import add_two_numbers_file # this is just the path to my model file
from augustus.strict import modelLoader

# Load model
with open(add_two_numbers_file, 'r') as model_file:
    model_str = model_file.read()
    model = modelLoader.loadXml(model_str)

# Run model
print model.calc({'x':[1,2,3],'y':[4,5,6]}).look()

However, I get an error:

AttributeError: 'DefineFunction' object has no attribute '_setupCalculate'

I'm using the latest trunk (revision 794) and am able to run the unmodified example (without a DefineFunction) without a problem. Is DefineFunction supported by Augustus?
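
For reference, the arithmetic this model describes can be sketched in plain Python without Augustus. This is only an illustration of the intended result for the sample input; the helper names add and derive_z are hypothetical stand-ins for the DefineFunction and DerivedField elements, not part of any Augustus API.

```python
# Plain-Python stand-in for the PMML above; it does not use Augustus.

def add(first, second):
    """Mirror of <DefineFunction name="add">: applies "+" to both parameters."""
    return float(first) + float(second)


def derive_z(x, y):
    """Mirror of <DerivedField name="z">: calls add() on fields x and y, row by row."""
    return [add(a, b) for a, b in zip(x, y)]


# Same sample input as the model.calc(...) call above.
expected_z = derive_z([1, 2, 3], [4, 5, 6])
print(expected_z)  # [5.0, 7.0, 9.0]
```

A working consumer should report these three values for z.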


Solution

  • jcrudy, you are right: this was a bug. (An API changed and DefineFunction was not brought up to date.) It is now fixed in the public SVN repository: with Augustus >= r795, you can run your example as originally intended.

    By the way, your PMML comes from an external file, yet you read it into a string and then parse that string into a PMML DOM. You can skip the intermediate step by passing the file name directly to loadXml:

    model = modelLoader.loadXml(add_two_numbers_file)
    

    (This could be relevant for very large PMML files; also note that PMML files can be gzipped.)
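
If you want to try a compressed model file, a gzipped copy can be produced with the Python standard library alone. The sketch below is a minimal illustration, not part of Augustus: gzip_copy and read_gzipped are hypothetical helper names, and how Augustus recognizes compressed input is not shown here, so check its documentation before relying on a particular file extension.

```python
import gzip
import shutil


def gzip_copy(src_path, dst_path):
    """Write a gzip-compressed copy of src_path to dst_path."""
    with open(src_path, 'rb') as src:
        dst = gzip.open(dst_path, 'wb')
        try:
            # Stream in chunks so a large PMML file is never held in memory.
            shutil.copyfileobj(src, dst)
        finally:
            dst.close()


def read_gzipped(path):
    """Return the decompressed bytes of a gzip file (handy for a round-trip check)."""
    f = gzip.open(path, 'rb')
    try:
        return f.read()
    finally:
        f.close()
```

For example, gzip_copy(add_two_numbers_file, add_two_numbers_file + '.gz') would write a compressed copy next to the original; the '.gz' suffix is an assumption made for illustration.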