Obtain PMML representation of glm-type model produced by caret::train


I am trying to produce PMML from a regression model trained in caret with method='glm'. Example model:

library('caret')

data('GermanCredit')

set.seed(123)

train_rows <- createDataPartition(GermanCredit$Class, p=0.6, list=FALSE)

train_x <- GermanCredit[train_rows, c('Age','ForeignWorker','Housing.Own',
                                      'Property.RealEstate','CreditHistory.Critical') ]
train_y <- as.integer( GermanCredit[train_rows, 'Class'] == 'Good' )

some_glm <- train( train_x, train_y, method='glm', family='binomial', 
                   trControl = trainControl(method='none') )

summary(some_glm$finalModel)

An unaccepted answer to a related question about method='rf' suggests that this is not possible with the matrix interface.

In my case I'm unable to get PMML using either the matrix or the formula interface (which I'm pretty sure produce identical finalModels anyway):

library('pmml')

pmml(some_glm$finalModel) 
# Error in if (model$call[[1]] == "glm") { : argument is of length zero

# Same problem if I try:
some_glm2 <- train( Class ~ Age + ForeignWorker + Housing.Own + 
                      Property.RealEstate + CreditHistory.Critical, 
                    data=GermanCredit[train_rows, ], family="binomial", 
                    method='glm',
                    trControl = trainControl(method='none') )
pmml(some_glm2$finalModel)

It does work in base glm with the formula interface:

some_glm_base <- glm(Class ~ Age + ForeignWorker + Housing.Own + 
                     Property.RealEstate + CreditHistory.Critical, 
                     data=GermanCredit[train_rows, ], family="binomial")
pmml(some_glm_base) # works

For interoperability, I would like to continue to use caret. Is there a way to convert some_glm produced in caret back to a format that pmml() will accept? Or am I forced to use the glm() construction if I want PMML functionality?


Solution

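  • The error message shows that pmml() tests model$call[[1]] == "glm", and that this element is empty on the finalModel returned by caret. If you set model$call[[1]] to "glm", the pmml function will work correctly.

    In your case: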

    library('pmml')
    
    some_glm$finalModel$call[[1]] <- "glm"
    pmml(some_glm$finalModel)
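
To see why the assignment helps, here is a minimal sketch in base R only. It assumes that the caret finalModel behaves like a glm fit whose call has been stripped, which is what the error message suggests but is not confirmed here. restore_glm_call is a hypothetical helper name, not part of caret or pmml, and the mtcars model is only a stand-in for some_glm$finalModel.

```r
# Hypothetical helper (not part of caret or pmml): make sure a fitted glm
# carries a call whose first element identifies it as a glm.
restore_glm_call <- function(model) {
  stopifnot(inherits(model, "glm"))
  if (is.null(model$call)) {
    model$call[[1]] <- "glm"  # same assignment as in the answer above
  }
  model
}

# Stand-in for a caret finalModel: a base glm fit with its call removed.
# Assumption: this mimics the object that triggers the error in the question.
fit <- glm(am ~ wt, data = mtcars, family = binomial)
fit$call <- NULL

# The check quoted in the error message fails on an empty call ...
before <- tryCatch(
  if (fit$call[[1]] == "glm") "ok",
  error = function(e) conditionMessage(e)
)

# ... and passes once the first element is set.
fit <- restore_glm_call(fit)
after <- if (fit$call[[1]] == "glm") "ok"

# With caret and pmml installed, the same idea would be applied as:
# pmml(restore_glm_call(some_glm$finalModel))
```

The helper only touches the call when it is missing, so a model fitted directly with glm() passes through unchanged.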