I have a data file with 13 attributes and a binary class variable. In Orange Canvas, when I apply the 'Naive Bayes classifier' and then check the performance with the 'Test Learner', I find that the results depend on the order in which the attributes are selected in the 'Select Attributes' widget. The difference is not large; for example, the accuracy goes from 0.78 to 0.76.
As the Naive Bayes algorithm consists of multiplying estimated probabilities, the order of the terms should not matter. Closer examination revealed:
The call looks like this:
bayes_rl = Orange.classification.bayes.NaiveLearner(estimator_constructor=Orange.statistics.estimate.RelativeFrequency())
bayes_relative = bayes_rl(data)
print bayes_relative.conditional_distributions
Of course, I am assuming here that calling the classifier from the command line is equivalent to selecting the attributes visually in the same order as they appear in the file.
This makes me a bit unsure about what is going on. Is it some kind of rounding error?
The order of attributes does matter because of the limited precision of machine floating-point representation, in particular when multiplying small (near-zero) numbers. This is probably the cause of this behavior (Naive Bayes in Orange uses 32-bit floating-point precision).
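To see the effect in isolation, here is a minimal sketch in plain Python that does not use Orange at all. It multiplies the same 13 numbers in many different orders, rounding to 32 bits after every multiplication, and collects the distinct results. The probability values, the helper names `to_float32` and `product_float32`, and the number of shuffles are all made up for illustration; this is not how Orange computes its products internally, only an emulation of 32-bit arithmetic.

```python
import random
import struct

def to_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def product_float32(factors):
    """Multiply left to right, rounding to 32 bits after every step."""
    result = 1.0
    for f in factors:
        result = to_float32(result * to_float32(f))
    return result

# Hypothetical conditional probabilities, one per attribute (13 in total).
probabilities = [0.31, 0.07, 0.52, 0.88, 0.13, 0.64, 0.29,
                 0.91, 0.45, 0.76, 0.18, 0.03, 0.59]

# Reference product in ordinary 64-bit precision.
reference = 1.0
for p in probabilities:
    reference *= p

# Try many attribute orders and keep every distinct 32-bit result.
rng = random.Random(0)  # fixed seed so the run is repeatable
products = set()
for _ in range(1000):
    order = list(probabilities)
    rng.shuffle(order)
    products.add(product_float32(order))

print("distinct 32-bit products: {0}".format(len(products)))
print("smallest: {0!r}".format(min(products)))
print("largest:  {0!r}".format(max(products)))
print("64-bit reference: {0!r}".format(reference))
```

Any spread between the smallest and largest value comes purely from rounding, since the factors are identical in every run. Each 32-bit multiplication is off by at most about 6e-8 in relative terms, so for 13 factors the products stay within roughly 1e-6 of each other.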