Tags: python, numpy, logistic-regression

Logistic regression: plotting decision boundary from theta


I have the following code:

import numpy as np
import matplotlib.pyplot as plt

x1 = np.random.randn(100)
y1 = np.random.randn(100) + 3
x2 = np.random.randn(100) + 3
y2 = np.random.randn(100)
plt.plot(x1, y1, "+", x2, y2, "x")
plt.axis('equal')
plt.show()

which results in the following image

[Figure: scatter plot of the two clusters, one marked with '+' and one with 'x']

I have implemented my own logistic regression; it returns a theta, and I want to use that theta to plot the decision boundary, but I'm not sure how to do this.

X = np.matrix(np.vstack((np.hstack((x1,x2)), np.hstack((y1,y2)))).T)
X = np.concatenate((np.ones((X.shape[0], 1)), X), axis=1)
Y = np.matrix(1.0 * np.hstack((np.zeros(100), np.ones(100)))).T

learning_rate = 0.0001
iterations    = 3000
theta         = np.matrix([[0.5], [0.5], [0.5]])
theta = logistic_regression(theta, X, Y, learning_rate, iterations)

and this gives theta =

[[ 0.40377942]
 [ 0.53696461]
 [ 0.1398419 ]]

for example. How can I use this to plot the decision boundary?


Solution

  • You want to plot the set of points where θᵀx = 0, where x is the feature vector (1, x, y). That is, you want to plot the line defined by theta[0] + theta[1]*x + theta[2]*y = 0. Solving for y:

    y = -(theta[0] + theta[1]*x)/theta[2]
    

    So, something like:

    theta = np.asarray(theta).ravel()  # Convert the 3x1 np.matrix to a 1-d array.
    x = np.linspace(-6, 6, 50)
    y = -(theta[0] + theta[1]*x)/theta[2]
    plt.plot(x, y)
    

    Something doesn't look right, though: you have theta[1] > 0 and theta[2] > 0, which gives the line a negative slope, while your clusters (the '+' cluster up and to the left, the 'x' cluster down and to the right) call for a boundary with a positive slope. One likely culprit is that a learning rate of 0.0001 over 3000 iterations is too small for theta to have converged.
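Putting it together, here is a self-contained sketch of the whole pipeline using plain ndarrays instead of np.matrix. The body of logistic_regression is an assumption (simple batch gradient descent on the logistic loss), since the question's implementation isn't shown; with a larger learning rate the fit converges and the boundary comes out with the expected positive slope:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two Gaussian clusters, as in the question.
x1 = rng.standard_normal(100)
y1 = rng.standard_normal(100) + 3
x2 = rng.standard_normal(100) + 3
y2 = rng.standard_normal(100)

# Design matrix with a bias column of ones; labels 0 for the first
# cluster, 1 for the second.
X = np.column_stack((np.ones(200), np.hstack((x1, x2)), np.hstack((y1, y2))))
Y = np.hstack((np.zeros(100), np.ones(100)))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_regression(theta, X, Y, learning_rate, iterations):
    # Batch gradient descent -- a guess at the question's (unshown)
    # implementation.
    for _ in range(iterations):
        grad = X.T @ (sigmoid(X @ theta) - Y) / len(Y)
        theta = theta - learning_rate * grad
    return theta

# A learning rate of 0.1 (rather than 0.0001) lets theta converge.
theta = logistic_regression(np.full(3, 0.5), X, Y, 0.1, 3000)

# Decision boundary: theta[0] + theta[1]*x + theta[2]*y = 0.
xs = np.linspace(-6, 6, 50)
ys = -(theta[0] + theta[1] * xs) / theta[2]
# plt.plot(xs, ys) would overlay this line on the scatter plot.
```

Here theta[1] ends up positive and theta[2] negative (larger x pushes toward class 1, larger y toward class 0), so the slope -theta[1]/theta[2] is positive, matching the geometry of the clusters.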