I have a large dataset of multidimensional data (240 dimensions).
I am a beginner at performing data mining and I want to apply Linear Discriminant Analysis by using MATLAB. However, I have seen that there are a lot of functions explained on the web but I do not understand how should they be applied.
Basically, I want to apply LDA.
After this step I want to be able to do a reconstruction for my data.
I can do this manually, but I was wondering if there are any predefined functions which can do this because they should already be optimized.
My initial data is something like: size(x) = [2000 240]
. So basically I have 240 features (dimensions) and 2000 data points. And I want to perform LDA on this data set.
The function classify
from Statistics Toolbox does Linear (and, if you set some options, Quadratic) Discriminant Analysis. There are a couple of worked examples in the documentation that explain how it should be used: type doc classify
or showdemo classdemo
to see them.
240 features is quite a lot given that you only have 2000 observations, even if you have only two classes. You might want to apply a dimension reduction method before LDA, such as PCA (see doc princomp
) or use a feature selection method (see doc sequentialfs
for one such method).