Given a binary classification problem:
There are four positive examples and five negative examples. Thus, P(+) = 4/9 and P(−) = 5/9. The entropy of the training examples is −4/9 log2(4/9) − 5/9 log2(5/9) = 0.9911.
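As a quick sanity check, the base entropy above can be computed directly (a minimal sketch using NumPy, just to make the arithmetic explicit):

```python
import numpy as np

# Class proportions: 4 positive and 5 negative out of 9 examples
p_pos, p_neg = 4/9, 5/9

# Entropy of the full training set: -sum(p * log2(p))
base_entropy = -p_pos*np.log2(p_pos) - p_neg*np.log2(p_neg)
print(round(base_entropy, 4))  # 0.9911
```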
For a3, which is a continuous attribute, I want to find the information gain for every candidate split.
So I sort the a3 values in ascending order and find the split points. But how do I calculate the entropy of each split?
The answer given is:
The Information Gain column in the image above is just 0.9911 − Entropy.
But how do I find the Entropy?
The formula for Entropy is: Entropy = −Σᵢ pᵢ log2(pᵢ), where pᵢ is the fraction of examples belonging to class i.
But I'm not understanding how to use this formula to find Entropy of the Split points.
When you split your data at a3 = 3.5, for example, two of your instances go into one split and the remaining seven go into the other. You calculate the entropy of each split and then take a weighted average of the two entropies, weighted by the fraction of instances in each split. For a3 = 3.5, the following Python code does it for you:
import numpy as np

# Entropy of the left split (2 instances: 1 positive, 1 negative)
entropy1 = -(1/2)*np.log2(1/2) - (1/2)*np.log2(1/2)
# Entropy of the right split (7 instances: 3 of one class, 4 of the other)
entropy2 = -(3/7)*np.log2(3/7) - (4/7)*np.log2(4/7)
# Weighted average: each split's entropy weighted by its share of the 9 instances
entropy = (2/9)*entropy1 + (7/9)*entropy2
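More generally, the same computation can be wrapped in a small helper that takes the per-class counts on each side of a candidate split, so you can score every split point the same way. This is just a sketch; the function names and interface are my own, not from the question:

```python
import numpy as np

def entropy(counts):
    """Entropy of a node given per-class counts, e.g. [1, 1] for 1 pos / 1 neg."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    p = p[p > 0]  # by convention, 0 * log2(0) is treated as 0
    return -(p * np.log2(p)).sum()

def split_info_gain(left_counts, right_counts):
    """Information gain of a binary split, relative to the parent node."""
    n_left, n_right = sum(left_counts), sum(right_counts)
    n = n_left + n_right
    parent = entropy([l + r for l, r in zip(left_counts, right_counts)])
    weighted = (n_left/n)*entropy(left_counts) + (n_right/n)*entropy(right_counts)
    return parent - weighted

# The a3 = 3.5 split: left has 1 positive + 1 negative, right has 3 + 4
gain = split_info_gain([1, 1], [3, 4])
```

Here the parent entropy recovers the 0.9911 from the question, and the gain for a3 = 3.5 is small because neither side of the split is much purer than the parent.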