If I have a set of data like this:
Classification   attribute-1   attribute-2
Correct          dog           dog
Correct          dog           dog
Wrong            dog           cat
Correct          cat           cat
Wrong            cat           dog
Wrong            cat           dog
Then what is the information gain of attribute-2 relative to attribute-1?
I've computed the entropy of the whole data set: -(3/6)·log2(3/6) - (3/6)·log2(3/6) = 1
Then I'm stuck. I think I also need the entropies of attribute-1 and attribute-2, and then to combine those three values in an information gain calculation, but I don't know how to put them together.
First you calculate the entropy of the subset you get for each value of an attribute, then take the weighted average of those entropies, and subtract that from the entropy of the whole data set; that difference is the information gain. I'll show how it's done for attribute-1.
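In symbols, for a data set S and an attribute A whose values v split S into subsets S_v:

gain(A) = entropy(S) - Σ_v (|S_v| / |S|) · entropy(S_v)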
For attribute-1:

attr-1 = dog:
info([2c,1w]) = entropy(2/3, 1/3)

attr-1 = cat:
info([1c,2w]) = entropy(1/3, 2/3)

Expected entropy for attribute-1:
info([2c,1w], [1c,2w]) = (3/6)·info([2c,1w]) + (3/6)·info([1c,2w])

Gain for attribute-1:
gain("attr-1") = info([3c,3w]) - info([2c,1w], [1c,2w])
Then do the same for attribute-2 and compare the two gains; the sketch below runs through both calculations.
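If you want to check the arithmetic, here is a minimal Python sketch using only the standard library; the names entropy, info_gain, and rows are my own for this example, not from any particular library.

    from math import log2
    from collections import Counter

    def entropy(labels):
        """Shannon entropy (in bits) of a list of class labels."""
        total = len(labels)
        counts = Counter(labels)
        return -sum((n / total) * log2(n / total) for n in counts.values())

    def info_gain(labels, attribute_values):
        """Information gain from splitting `labels` by `attribute_values`."""
        total = len(labels)
        # Expected entropy after the split: weighted average over each value's subset.
        split_entropy = 0.0
        for value in set(attribute_values):
            subset = [c for c, v in zip(labels, attribute_values) if v == value]
            split_entropy += (len(subset) / total) * entropy(subset)
        return entropy(labels) - split_entropy

    # The six rows from the question: (classification, attribute-1, attribute-2).
    rows = [
        ("Correct", "dog", "dog"),
        ("Correct", "dog", "dog"),
        ("Wrong",   "dog", "cat"),
        ("Correct", "cat", "cat"),
        ("Wrong",   "cat", "dog"),
        ("Wrong",   "cat", "dog"),
    ]
    labels = [r[0] for r in rows]
    attr1  = [r[1] for r in rows]
    attr2  = [r[2] for r in rows]

    print(entropy(labels))           # 1.0
    print(info_gain(labels, attr1))  # ~0.082
    print(info_gain(labels, attr2))  # ~0.0

Running it shows attribute-1 gains about 0.082 bits, while attribute-2 gains nothing: splitting on attribute-2 leaves both subsets evenly split between Correct and Wrong, so it tells you nothing about the classification.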