There are two semi-circles of width thk with inner radius rad, separated by sep as shown (red is -1 and blue is +1). The center of the top semi-circle is aligned with the middle of the edge of the bottom semi-circle. This task is linearly separable when sep >= 0, and not when sep < 0. Set rad = 10, thk = 5 and sep = 5. Then generate 2,000 examples uniformly, which means you will have approximately 1,000 examples for each class.
Figure Describing the problem:
I used logistic regression to separate the two classes of the semi-circle dataset. When sep is negative, the classes are no longer linearly separable and logistic regression gives a poor result.
I want to add a new nonlinear feature to the dataset so that logistic regression produces a better result.
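For reference, here is a minimal sketch of the setup, assuming NumPy and scikit-learn. The function name make_semicircles, the placement of the top band at the origin, and the choice of the top band as the -1 class are my assumptions, since the figure is what fixes those details.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def make_semicircles(n=2000, rad=10.0, thk=5.0, sep=5.0, seed=0):
    """Sample n points uniformly from the two semi-circular bands."""
    rng = np.random.default_rng(seed)
    # Sampling r^2 uniformly makes the points uniform in area, not in radius.
    r = np.sqrt(rng.uniform(rad**2, (rad + thk) ** 2, size=n))
    theta = rng.uniform(0.0, np.pi, size=n)
    # Both bands have equal area, so each point picks a band with probability 1/2.
    top = rng.random(n) < 0.5
    # Assumption: the top band is centred at the origin and labelled -1; the
    # bottom band is centred at (rad + thk/2, -sep) and labelled +1.
    x1 = np.where(top, r * np.cos(theta), rad + thk / 2 + r * np.cos(theta))
    x2 = np.where(top, r * np.sin(theta), -sep - r * np.sin(theta))
    return np.column_stack([x1, x2]), np.where(top, -1, 1)


# Negative sep: the bands interlock, so a straight line cannot split them.
X, labels = make_semicircles(sep=-5.0)
clf = LogisticRegression(max_iter=1000).fit(X, labels)
linear_acc = clf.score(X, labels)
print(f"training accuracy with raw (x1, x2), sep=-5: {linear_acc:.3f}")
```

Calling make_semicircles() with the default sep=5.0 gives the linearly separable version of the same data.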
The problem here is that logistic regression is a classifier with a linear decision boundary. To solve this you can use the kernel trick, which implicitly maps your data into a higher-dimensional space in which the two classes can become linearly separable.
If you choose a kernel such as the radial basis function (RBF), you should be able to separate the datasets with 100% accuracy.
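A rough sketch of the RBF idea, assuming scikit-learn. Note that scikit-learn's LogisticRegression has no kernel option, so this example swaps in SVC with an RBF kernel as a stand-in kernelized classifier. The data generator, the geometry it assumes (top band at the origin, labelled -1), and the gamma and C values are illustrative choices of mine, not tuned settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_semicircles(n=2000, rad=10.0, thk=5.0, sep=-5.0, seed=0):
    # Hypothetical generator: top band centred at the origin (label -1),
    # bottom band centred at (rad + thk/2, -sep) (label +1).
    rng = np.random.default_rng(seed)
    r = np.sqrt(rng.uniform(rad**2, (rad + thk) ** 2, size=n))  # uniform in area
    theta = rng.uniform(0.0, np.pi, size=n)
    top = rng.random(n) < 0.5
    x1 = np.where(top, r * np.cos(theta), rad + thk / 2 + r * np.cos(theta))
    x2 = np.where(top, r * np.sin(theta), -sep - r * np.sin(theta))
    return np.column_stack([x1, x2]), np.where(top, -1, 1)


X, labels = make_semicircles(sep=-5.0)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0
)

# Standardise first so that a single gamma value is meaningful for both axes.
rbf_clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma=1.0, C=10.0))
rbf_clf.fit(X_train, y_train)
rbf_acc = rbf_clf.score(X_test, y_test)
print(f"held-out accuracy with RBF kernel, sep=-5: {rbf_acc:.3f}")
```

If you want to keep logistic regression itself, one option is to feed it an explicit approximation of the RBF feature map instead of using a kernelized classifier.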
A simpler way would be to add a nonlinear feature such as x^2y^2 as a third column of your dataset. This may give better results, although a feature like that is usually used when your data is shaped like an ellipsoid.
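Here is a minimal sketch of adding that third column and refitting logistic regression, assuming NumPy and scikit-learn. The data generator and the geometry it assumes are hypothetical stand-ins for your own data. Whether this particular feature helps on the semi-circles is something to check by comparing the two scores, not something the sketch guarantees.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_semicircles(n=2000, rad=10.0, thk=5.0, sep=-5.0, seed=0):
    # Hypothetical generator: top band centred at the origin (label -1),
    # bottom band centred at (rad + thk/2, -sep) (label +1).
    rng = np.random.default_rng(seed)
    r = np.sqrt(rng.uniform(rad**2, (rad + thk) ** 2, size=n))  # uniform in area
    theta = rng.uniform(0.0, np.pi, size=n)
    top = rng.random(n) < 0.5
    x1 = np.where(top, r * np.cos(theta), rad + thk / 2 + r * np.cos(theta))
    x2 = np.where(top, r * np.sin(theta), -sep - r * np.sin(theta))
    return np.column_stack([x1, x2]), np.where(top, -1, 1)


X, labels = make_semicircles(sep=-5.0)

# Third column: the nonlinear feature x^2 * y^2 built from the two raw columns.
X_aug = np.column_stack([X, (X[:, 0] ** 2) * (X[:, 1] ** 2)])


def fit_and_score(features, targets):
    # Scaling keeps the large-valued new column from dominating the optimiser.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return model.fit(features, targets).score(features, targets)


acc_base = fit_and_score(X, labels)
acc_aug = fit_and_score(X_aug, labels)
print(f"raw features: {acc_base:.3f}, with x^2*y^2 column: {acc_aug:.3f}")
```

The same pattern works for any other candidate feature: build the column, stack it onto X, and compare the scores.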