
How to set up and use sample weight in the Orange python package?


I am new to the Orange python package for data mining. I am using Orange 2.7.

My dataset has a binary target (Good and Bad). The Good instances are downsampled, so each one carries a sampling weight of 10. How can I set up and use this weight for classification analysis in both the Windows and Linux versions of Orange? Thank you for your help!


Solution

  • Add a new meta column to your data that contains the instance weights (see Meta attributes and Table.add_meta_attribute). Store the meta column's id and pass that id to the learner along with the data.

    import Orange
    iris = Orange.data.Table("iris")
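    # Add a continuous meta attribute that holds the instance weights.
    weight = Orange.feature.Continuous("weight")
    weight_id = -10  # meta ids are negative integers; -10 is an arbitrary choice
    iris.domain.add_meta(weight_id, weight)
    iris.add_meta_attribute(weight, 1.0)  # every instance starts with weight 1.0
    # Give instances 50..149 a weight of 10.
    for i in range(50, 150):
        iris[i][weight] = 10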
    
    # Train a tree classifier on weighted data.
    clsf = Orange.classification.tree.TreeLearner(iris, weight_id)
    
    # Evaluate learner performance on weighted data
    results = Orange.evaluation.testing.cross_validation(
        [Orange.classification.tree.TreeLearner,
         Orange.classification.bayes.NaiveLearner],
        (iris, weight_id)  # Note how you pass the weight id to testing functions
    )
    auc = Orange.evaluation.scoring.AUC(results)
    ca = Orange.evaluation.scoring.CA(results)
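
To see what a sampling weight of 10 means in practice, here is a minimal sketch in plain Python that does not use Orange at all. The toy data and the helper names (`weighted_class_counts`, `weighted_accuracy`) are hypothetical and only illustrate the idea. This is not how Orange computes its scores internally.

```python
# Hypothetical toy data: (label, weight) pairs. Each retained "Good"
# instance stands in for 10 original instances; "Bad" ones count once.
instances = [("Good", 10.0)] * 3 + [("Bad", 1.0)] * 20


def weighted_class_counts(rows):
    """Sum the weights per class label."""
    counts = {}
    for label, weight in rows:
        counts[label] = counts.get(label, 0.0) + weight
    return counts


def weighted_accuracy(rows, predictions):
    """Return the fraction of the total weight that was predicted correctly."""
    total = sum(weight for _, weight in rows)
    correct = sum(
        weight
        for (label, weight), predicted in zip(rows, predictions)
        if predicted == label
    )
    return correct / total


counts = weighted_class_counts(instances)  # {'Good': 30.0, 'Bad': 20.0}

# A trivial classifier that always predicts "Bad".
always_bad = ["Bad"] * len(instances)
accuracy = weighted_accuracy(instances, always_bad)  # 20.0 / 50.0 = 0.4
```

Without weights, the always-"Bad" classifier would look correct on 20 of 23 instances. With weights, it is correct on only 20 of 50 units of weight, which is why the weight id has to reach both the learner and the evaluation functions.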