
Error when plotting individual trees from a cforest() forest with the partykit package in R


I am running a conditional inference tree analysis with the partykit package in R and want to plot individual trees extracted from a forest grown by cforest(). Calling plot() on an extracted tree raises an error. The following code reproduces the error with the iris data.

Code:

library(partykit)

cf <- cforest(Species ~., data=iris)
tr <- gettree(cf, tree=1)
plot(tr)

Error in console:

Error in Summary.factor(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, : ‘min’ not meaningful for factors

I expect plot() to draw the individual trees from the cforest() result.


Solution

  • I found an answer after looking into it.

    Since plot() works with ctree() objects, I compared a tree extracted from the cforest() result with a tree fitted by ctree() and found a difference in the structure of their "fitted" data frames. For the ctree() object, which can be plotted:

    $ fitted:'data.frame':  150 obs. of  3 variables:
      ..$ (fitted)  : int [1:150] 2 2 2 2 2 2 2 2 2 2 ...
      ..$ (weights) : num [1:150] 1 1 1 1 1 1 1 1 1 1 ...
      ..$ (response): Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
    

    but for a tree from cforest() result, which cannot be plotted:

    $ fitted:'data.frame':  150 obs. of  4 variables:
      ..$ idx       : int [1:150] 1 2 3 4 5 6 7 8 9 10 ...
      ..$ (response): Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
      ..$ (weights) : int [1:150] 0 1 1 1 1 1 1 1 0 0 ...
      ..$ (fitted)  : int [1:150] 2 2 2 2 2 2 2 2 2 2 ...
    

    Note that the columns (response), (weights) and (fitted) appear in a different order in the "fitted" data frames of the two trees, and that the cforest() tree has an additional idx column.

    I therefore rebuilt the "fitted" data frame of the cforest() tree so that it matches the ctree() layout, after which plotting succeeds:

    library(partykit)
    cf <- cforest(Species ~ ., data = iris)
    tr <- gettree(cf, tree = 1)
    # Rebuild "fitted" in the ctree() column order; this also drops the idx column
    nfitted <- data.frame(tr$fitted$`(fitted)`, tr$fitted$`(weights)`, tr$fitted$`(response)`)
    colnames(nfitted) <- c('(fitted)', '(weights)', '(response)')
    tr$fitted <- nfitted
    plot(tr)
    

    I hope this helps others who run into the same problem when plotting trees from cforest() in the partykit package.
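
If several trees from the forest need plotting, the column reordering above could be wrapped in a small function. The following is only a sketch: `reorder_fitted` and `mock_tree` are hypothetical names, not part of partykit. It assumes the extracted tree exposes a `fitted` data frame with the three columns shown above, and `mock_tree` is a stand-in list that only mimics that column layout so the snippet runs without the package.

```r
# Hypothetical helper (not part of partykit): put the "fitted" columns of an
# extracted tree into the order a ctree() object uses and drop extras such as idx.
reorder_fitted <- function(tree) {
  wanted <- c("(fitted)", "(weights)", "(response)")
  missing_cols <- setdiff(wanted, names(tree$fitted))
  if (length(missing_cols) > 0) {
    stop("fitted data frame lacks: ", paste(missing_cols, collapse = ", "))
  }
  # Selecting by name keeps only the wanted columns, in the wanted order
  tree$fitted <- tree$fitted[, wanted, drop = FALSE]
  tree
}

# Stand-in object with the same column layout as the cforest() tree shown above
# (an assumption for illustration; it is not a real party object).
mock_tree <- list(fitted = data.frame(
  idx = 1:3,
  `(response)` = factor(c("a", "b", "a")),
  `(weights)` = c(0L, 1L, 1L),
  `(fitted)` = c(2L, 2L, 3L),
  check.names = FALSE
))

fixed <- reorder_fitted(mock_tree)
names(fixed$fitted)
```

With partykit loaded, the intended use would be `plot(reorder_fitted(gettree(cf, tree = 1)))`. Selecting columns by name rather than by position means the helper does not depend on the order in which gettree() happens to return them.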