Below is MATLAB code for a naive Bayes classifier that assigns arbitrary numbers to classes.
training = [3;5;17;19;24;27;31;38;45;48;52;56;66;69;73;78;84;88];
target_class = [0;0;10;10;20;20;30;30;40;40;50;50;60;60;70;70;80;80];
test = [1:2:90]';
class = classify(test,training, target_class, 'diaglinear'); % Naive Bayes classifier
[test class]
(a) Could someone provide code snippets for calculating the Bayes misclassification error and the accuracy? I went through MATLAB's documentation for [class,err]=classify(...), but I could not follow it or get it to work.
(b) Also, how do I draw a scatter plot and a histogram showing the number of data points belonging to each class? I tried scatter(training(:),target_class(:)), but it gives something else!
(c) How do I work with crossvalidate()? An example would really help. Thank you.
(a) To calculate the misclassification error you also need the true labels of the test set, test_class. Then you can compare the output variable class with test_class:
misserr = sum(test_class~=class)./numel(test_class);
If you don't have the test classes, the 2nd output argument err gives an estimate of the misclassification error obtained by applying the fitted model to the training set.
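Both options can be sketched together as follows. This is only an illustration: the question does not supply true labels for the test points, so test_class below is a hypothetical ground truth that assigns each number to its decade, which happens to be consistent with the training labels. The name predicted is also an assumption, used instead of class to avoid shadowing the built-in function. Note that the result is an empirical error rate, not the theoretical Bayes error.

```matlab
% Data from the question (one feature, nine classes).
training     = [3;5;17;19;24;27;31;38;45;48;52;56;66;69;73;78;84;88];
target_class = [0;0;10;10;20;20;30;30;40;40;50;50;60;60;70;70;80;80];
test         = (1:2:90)';

% The second output is the apparent (training-set) error rate.
[predicted, err] = classify(test, training, target_class, 'diaglinear');

% HYPOTHETICAL true labels for the test points (assumption for illustration):
% every number belongs to its decade, i.e. 0, 10, ..., 80.
test_class = floor(test/10)*10;

% Empirical misclassification rate and accuracy on the test set.
misserr  = sum(test_class ~= predicted) / numel(test_class);
accuracy = 1 - misserr;

fprintf('Training-set error estimate: %.3f\n', err);
fprintf('Test error: %.3f, accuracy: %.3f\n', misserr, accuracy);
```

With real data, replace the test_class line with the actual labels of your test set.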
(b) If you have just 2 factors (columns) in the training data set you can just do
scatter(training(:,1),training(:,2),[],target_class)
Correspondingly, you can use SCATTER3 for 3 factors.
For more factors you can perform Principal Component Analysis with PRINCOMP (superseded by pca in newer releases) and plot the first 2 or 3 components.
UPDATE: I missed that you actually have only one factor. Your scatter statement should work fine. What don't you like about it? You can also color the points by class by passing target_class as the 4th argument, and you can swap the 1st and 2nd arguments, which may give a better representation.
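For the one-factor data in the question, the scatter plot and a per-class count could look roughly like this. It is a sketch, not the only way to do it: the marker size 36, the axis labels, and the use of unique with accumarray and bar for the histogram are my own choices.

```matlab
% Data from the question.
training     = [3;5;17;19;24;27;31;38;45;48;52;56;66;69;73;78;84;88];
target_class = [0;0;10;10;20;20;30;30;40;40;50;50;60;60;70;70;80;80];

% Scatter plot: value on x, class on y, points colored by class.
figure;
scatter(training, target_class, 36, target_class, 'filled');
xlabel('value');
ylabel('class');

% Number of training points per class.
[classes, ~, idx] = unique(target_class);  % idx maps each point to its class
counts = accumarray(idx, 1);               % one count per distinct class

% Bar chart of the class counts (a histogram over the class labels).
figure;
bar(classes, counts);
xlabel('class');
ylabel('number of training points');
```

Here every class has exactly two training points, so all bars have height 2.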
(c) You can perform cross-validation with the CROSSVAL and CVPARTITION functions from the Statistics Toolbox. See the documentation for useful examples.
Another SO question, "How to use a cross validation test with MATLAB?", covers a few additional options.
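A minimal sketch of CROSSVAL with CVPARTITION on the data from the question is given below. I use a leave-one-out partition because each class has only two training points, so a k-fold split with small k may leave too few observations for the classifier; that choice, and the names cp, predfun and cvErr, are assumptions for illustration.

```matlab
% Data from the question.
training     = [3;5;17;19;24;27;31;38;45;48;52;56;66;69;73;78;84;88];
target_class = [0;0;10;10;20;20;30;30;40;40;50;50;60;60;70;70;80;80];

% Leave-one-out partition: each observation is held out once.
cp = cvpartition(numel(target_class), 'LeaveOut');

% Prediction function: fit on the training fold, predict the test fold.
predfun = @(Xtrain, ytrain, Xtest) ...
    classify(Xtest, Xtrain, ytrain, 'diaglinear');

% 'mcr' asks crossval for the cross-validated misclassification rate.
cvErr = crossval('mcr', training, target_class, ...
                 'Predfun', predfun, 'Partition', cp);

fprintf('Cross-validated misclassification rate: %.3f\n', cvErr);
```

With more data per class, a stratified k-fold partition such as cvpartition(target_class, 'KFold', 10) would be the more usual choice.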