Tags: python, machine-learning, neural-network, logistic-regression

Will a neural network always outperform a logistic regression classifier?


I am just getting started with machine learning and am exploring different algorithms. I took a binary classification problem from the internet and tried applying various machine learning techniques to it.

First I tried running a naive Bayes classifier on it and found a success rate of about 75%. I then tried logistic regression and found a staggering success rate of 90%. Next I applied regularisation to my classifier; here is the curve I found when varying lambda (the regularisation parameter) over 30 values [plot: training error in red, validation error in blue]. As you can see, the error in both curves increases with lambda. I think this suggests that my hypothesis is underfit to begin with, and that the underfitting gets worse as lambda increases. Is this the correct way to interpret the plot?
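
For reference, here is a minimal sketch of that kind of lambda sweep using scikit-learn. The synthetic dataset, the lambda range, and the train/validation split are assumptions standing in for the asker's actual setup; note that scikit-learn parameterises regularisation as C = 1/lambda.

```python
# Sketch: sweep the regularisation parameter lambda over 30 values and
# plot training vs. validation error, as described above.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in for the asker's 31-feature binary classification problem.
X, y = make_classification(n_samples=1000, n_features=31, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0)

lambdas = np.logspace(-3, 3, 30)   # 30 regularisation values
train_err, val_err = [], []
for lam in lambdas:
    clf = LogisticRegression(C=1.0 / lam, max_iter=1000)  # C is 1/lambda
    clf.fit(X_train, y_train)
    train_err.append(1 - clf.score(X_train, y_train))
    val_err.append(1 - clf.score(X_val, y_val))

plt.semilogx(lambdas, train_err, 'r', label='training error')
plt.semilogx(lambdas, val_err, 'b', label='validation error')
plt.xlabel('lambda'); plt.ylabel('error'); plt.legend(); plt.show()
```

If both curves rise together as lambda grows, the model is being pushed further into underfitting, which matches the interpretation above.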

Either way, to tackle the underfitting it would make sense to try a more complicated model, so I turned to a neural network. My problem has 31 features, and I chose a network with two hidden layers of 10 nodes each (sketched below).
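
For concreteness, here is a minimal sketch of such a network using scikit-learn's MLPClassifier, reusing X_train and y_train from the sketch above; the solver, activation, and iteration budget are assumptions, not the asker's actual settings.

```python
# Sketch: 31 input features, two hidden layers of 10 nodes each.
from sklearn.neural_network import MLPClassifier

nn = MLPClassifier(hidden_layer_sizes=(10, 10),
                   activation='logistic',  # sigmoid units, as in the Ng course
                   max_iter=1000, random_state=0)
nn.fit(X_train, y_train)
print('training accuracy: %.2f' % nn.score(X_train, y_train))
```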

After training, I found that it classifies only 65% of the training data correctly. That is worse than both the naive Bayes classifier and the logistic regression. How often does this happen? Is it more likely that there is something wrong with my implementation of the neural network?

It is also interesting that the neural network seems to converge after just 25-30 iterations, whereas my logistic regression took 300 iterations to converge. I did consider the possibility that the neural network is getting stuck in a local minimum, but according to Andrew Ng's excellent course on machine learning, which I am following, that is rather unlikely.

From what the course explained, a neural network generally gives better predictions than logistic regression, but you may run into problems with overfitting. However, I don't think that is the problem here, since the 65% success rate is on the training set.

Do I need to go over my neural network implementation, or can this legitimately happen?


Solution

  • First, please try larger hidden layers, say 200 nodes each, then update your results so we can see what the critical problem is.

    When you use a neural network to classify your data, it effectively fits a vector space that is suitable for the task. Since your data has 31 dimensions, a space of at least 32 dimensions can classify the training data perfectly, as long as no sample appears in both the positive and the negative class. So if you get bad performance on the training set, just enlarge your neural network until you reach 100% accuracy on the training set; only then should you start thinking about the generalization problem.
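
    As an illustration, here is a minimal sketch of that experiment, continuing the synthetic data from the sketches above; the layer widths and iteration budget are assumptions chosen for illustration.

    ```python
    # Sketch: widen the hidden layers until the network fits the training
    # set, then check how the validation score behaves.
    from sklearn.neural_network import MLPClassifier

    for width in (10, 50, 200):
        nn = MLPClassifier(hidden_layer_sizes=(width, width),
                           max_iter=2000, random_state=0)
        nn.fit(X_train, y_train)
        print('width %3d  train %.2f  val %.2f'
              % (width, nn.score(X_train, y_train), nn.score(X_val, y_val)))
    ```

    If even the widest network cannot fit the training set, that points to an implementation or optimisation problem rather than insufficient capacity.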