I have a problem with the CART classification algorithm.
My data looks like this. My question is: how can I calculate the "goodness of split" using the Gini index when all of the data is numeric?
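The Gini index is computed from the class labels in a node, so it applies directly to splits on categorical features. It measures the probability that a randomly chosen sample would be misclassified if it were labeled at random according to the class distribution in that node. So for a tree we pick the feature whose split gives the lowest Gini index.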
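As a rough illustration of that definition, here is a minimal sketch of the Gini index for one node, computed as 1 minus the sum of squared class proportions. The function name `gini_impurity` and the "yes"/"no" labels are made up for the example and do not come from your data.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini index of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0  # convention chosen here for an empty node
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# Hypothetical labels, only to show the two extremes for two classes.
print(gini_impurity(["yes", "yes", "no", "no"]))  # 0.5 (evenly mixed)
print(gini_impurity(["yes", "yes", "yes"]))       # 0.0 (pure node)
```

A pure node scores 0, and the score grows as the classes become more evenly mixed.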
In your case the data is numeric, so a split on a feature is made by comparing its values against a threshold: samples at or below the threshold go to one branch, and samples above it go to the other.
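To find the threshold, sort the values of the numeric feature in ascending order, try each value (or the midpoint between each pair of consecutive values) as a candidate threshold, and calculate the gain for each candidate, i.e. the reduction in the Gini index from the parent node to the size-weighted Gini index of the two branches. The candidate with the maximum gain, which is the same as the one with the lowest weighted Gini index, will be your threshold.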
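The threshold search can be sketched roughly as follows. This is only an illustration under some assumptions: candidates are taken as midpoints between consecutive distinct sorted values (a common convention, not the only one), the score is the size-weighted Gini index of the two branches, and the names `gini`, `best_threshold`, `ages` and `bought` plus the sample data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini index of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split of one numeric feature.

    Returns (None, inf) if all values are identical, since no split is possible.
    """
    # Sort samples by feature value only, keeping labels aligned.
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    xs = [v for v, _ in pairs]
    ys = [y for _, y in pairs]
    n = len(xs)

    best = (None, float("inf"))
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # equal neighbours cannot be separated by a threshold
        threshold = (xs[i - 1] + xs[i]) / 2  # midpoint candidate (assumption)
        left, right = ys[:i], ys[i:]         # value <= threshold, value > threshold
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Hypothetical data, not taken from the question.
ages = [22, 25, 30, 35, 40, 45]
bought = ["no", "no", "no", "yes", "yes", "yes"]
print(best_threshold(ages, bought))  # (32.5, 0.0)
```

Minimizing the weighted Gini index of the branches picks the same threshold as maximizing the gain, because the parent node's Gini index is the same for every candidate. To choose the feature to split on, you would repeat this for each numeric feature and keep the one with the lowest score.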