Tags: python, r, rpy2, performance-estimation

How can I generate CD diagrams using Python/rpy2?


I want to generate a graph similar to this:

[Image: CD diagram, Nemenyi test]

I know the API below can generate a very similar diagram, but it takes an object created by the same R package as input rather than a matrix or vectors.

https://cran.r-project.org/web/packages/performanceEstimation/performanceEstimation.pdf

    ## Estimating error rate and accuracy for 8 SVM variants
    ## (4 cost values x 2 gamma values) on three data sets,
    ## using one repetition of 10-fold CV
    library(performanceEstimation)
    library(e1071)
    data(iris)
    data(Satellite, package="mlbench")
    data(LetterRecognition, package="mlbench")
    ## running the estimation experiment
    res <- performanceEstimation(
        c(PredTask(Species ~ ., iris),
          PredTask(classes ~ ., Satellite, "sat"),
          PredTask(lettr ~ ., LetterRecognition, "letter")),
        workflowVariants(learner="svm",
                         learner.pars=list(cost=1:4, gamma=c(0.1, 0.01))),
        EstimationTask(metrics=c("err", "acc"), method=CV()))
    ## checking the top performers
    topPerformers(res)
    ## now let us assume that we will choose "svm.v2" as our baseline
    ## carry out the paired comparisons
    pres <- pairedComparisons(res, "svm.v2")
    ## obtaining a CD diagram comparing all workflows against
    ## the baseline (defined in the previous call to pairedComparisons)
    CDdiagram.BD(pres, metric="err")
    ## OR this for the Nemenyi test
    CDdiagram.Nemenyi(pres, metric="err")

Solution

  • Using Orange (http://docs.orange.biolab.si/3/modules/evaluation.cd.html)

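    # Legacy Orange 2.x module; the Orange 3 docs linked above list the
    # same two functions under Orange.evaluation.
    import Orange, orngStat

    names = lbs      # labels of each technique (list of str, supplied by you)
    avranks = means  # average rank of each technique (list of float, supplied by you)
    number_of_datasets = 30  # number of data sets the ranks were averaged over

    # alpha must be passed as a string: '0.1' or '0.05'
    cd = orngStat.compute_CD(avranks, number_of_datasets, alpha='0.1')
    # graph_ranks writes the diagram to the given file, so no plt.show() is needed
    orngStat.graph_ranks("output.png", avranks, names,
                         cd=cd, width=6, textspace=1.5)
    print(cd)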
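
If the results start out as a plain matrix of scores, the average ranks and the critical difference can be computed by hand, with no R object involved. The following is only a minimal sketch, assuming one row per data set and one column per technique. The `scores` values and the helper names `rank_row`, `average_ranks` and `nemenyi_cd` are made up for illustration, and the critical values are the Nemenyi table from Demšar (2006), which covers 2 to 10 techniques.

```python
import math

# Critical values q_alpha of the two-tailed Nemenyi test for k = 2..10
# techniques (Demsar, 2006). Keys are strings, mirroring the Orange call.
Q_ALPHA = {
    "0.05": [1.960, 2.343, 2.569, 2.728, 2.850, 2.949, 3.031, 3.102, 3.164],
    "0.1": [1.645, 2.052, 2.291, 2.459, 2.589, 2.693, 2.780, 2.855, 2.920],
}


def rank_row(scores, higher_is_better=True):
    """Rank one data set's scores (1 = best); tied scores share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=higher_is_better)
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next score ties with the current one.
        while (end + 1 < len(order)
               and scores[order[end + 1]] == scores[order[start]]):
            end += 1
        shared = (start + end) / 2.0 + 1.0  # mean of positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def average_ranks(matrix, higher_is_better=True):
    """Column-wise mean rank for a (data sets x techniques) score matrix."""
    rows = [rank_row(row, higher_is_better) for row in matrix]
    return [sum(col) / len(rows) for col in zip(*rows)]


def nemenyi_cd(k, n_datasets, alpha="0.05"):
    """Critical difference CD = q_alpha * sqrt(k * (k + 1) / (6 * N))."""
    q = Q_ALPHA[alpha][k - 2]
    return q * math.sqrt(k * (k + 1) / (6.0 * n_datasets))


# Hypothetical accuracies: 4 data sets (rows) x 3 techniques (columns).
scores = [
    [0.90, 0.85, 0.80],
    [0.70, 0.75, 0.65],
    [0.88, 0.88, 0.60],
    [0.95, 0.91, 0.93],
]
avranks = average_ranks(scores)
cd = nemenyi_cd(k=len(avranks), n_datasets=len(scores))
print(avranks, cd)
```

The resulting `avranks` list is what `graph_ranks` expects in place of `means`, and `cd` can stand in for the value returned by `compute_CD`. For metrics where lower is better, such as error rate, pass `higher_is_better=False`.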