I was working through the first course of the Deep Learning Specialization, where the first programming assignment is to build a logistic regression model from scratch. Since this was my first time building a model from scratch, and it took me a while to digest the math, I ran into lots of errors. Among them is one I am completely unable to fix and simply cannot understand: an assertion error saying that the values of the trained weights d['w'] are wrong (dw, in the code, is the derivative of the cost with respect to the weights).
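(For reference, the gradient computed in propagate below is dw = np.dot(X, (A - Y).T) / m; with X of shape (n_x, m) and A, Y of shape (1, m), dw comes out as (n_x, 1). That is why the shape assertion in the test passes and only the value check fails.)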
The code:
import copy
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def propagate(w, b, X, Y):
    m = X.shape[1]
    # forward pass
    A = sigmoid(np.dot(w.T, X) + b)
    # cross-entropy cost
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    # gradients
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    cost = np.squeeze(np.array(cost))
    grads = {"dw": dw, "db": db}
    return grads, cost
def optimize(w, b, X, Y, num_iterations=100, learning_rate=0.009, print_cost=False):
    w = copy.deepcopy(w)
    b = copy.deepcopy(b)
    costs = []
    for i in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        dw = grads["dw"]
        db = grads["db"]
        # gradient-descent update
        w = w - learning_rate * dw
        b = b - learning_rate * db
        # record (and optionally print) the cost every 100 iterations
        if i % 100 == 0:
            costs.append(cost)
            if print_cost:
                print("Cost after iteration %i: %f" % (i, cost))
    params = {"w": w, "b": b}
    grads = {"dw": dw, "db": db}
    return params, grads, costs
def predict(w, b, X):
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)
    A = sigmoid(np.dot(w.T, X) + b)
    # threshold the activations at 0.5
    for i in range(A.shape[1]):
        if A[0, i] > 0.5:
            Y_prediction[0, i] = 1.0
        else:
            Y_prediction[0, i] = 0.0
    return Y_prediction
def model(X_train, Y_train, X_test, Y_test, num_iterations=2000, learning_rate=0.5, print_cost=False):
    w = np.zeros(shape=(X_train.shape[0], 1))
    b = np.zeros(shape=(1, 1))
    params, grads, costs = optimize(w, b, X_train, Y_train)
    b = params["b"]
    w = params["w"]
    Y_prediction_train = predict(w, b, X_train)
    Y_prediction_test = predict(w, b, X_test)
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}
    return d
model_test(model)
The model_test function wasn't defined anywhere in the course materials; I assume it's built into the exercise (the traceback shows it comes from public_tests.py). But here's the issue:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-36-7f17a31b22cb> in <module>
----> 1 model_test(model)
~/work/release/W2A2/public_tests.py in model_test(target)
117 assert type(d['w']) == np.ndarray, f"Wrong type for d['w']. {type(d['w'])} != np.ndarray"
118 assert d['w'].shape == (X.shape[0], 1), f"Wrong shape for d['w']. {d['w'].shape} != {(X.shape[0], 1)}"
--> 119 assert np.allclose(d['w'], expected_output['w']), f"Wrong values for d['w']. {d['w']} != {expected_output['w']}"
120
121 assert np.allclose(d['b'], expected_output['b']), f"Wrong values for d['b']. {d['b']} != {expected_output['b']}"
AssertionError: Wrong values for d['w']. [[ 0.28154433]
[-0.11519574]
[ 0.13142694]
[ 0.20526551]] != [[ 0.00194946]
[-0.0005046 ]
[ 0.00083111]
[ 0.00143207]]
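For what it's worth, judging from the assert lines above, model_test seems to train a model on a fixed dataset and compare the learned parameters against reference values, roughly like this (my reconstruction, not the real public_tests.py; X, Y, and expected_output below are placeholders for whatever fixtures the actual file defines):

import numpy as np

X = np.random.randn(4, 7)                            # placeholder features, not the real fixture
Y = (np.random.rand(1, 7) > 0.5) * 1.0               # placeholder labels
expected_output = {"w": np.zeros((4, 1)), "b": 0.0}  # placeholder reference values

def model_test(target):
    d = target(X, Y, X, Y)  # the real test presumably passes its own hyperparameters too
    assert type(d['w']) == np.ndarray, f"Wrong type for d['w']. {type(d['w'])} != np.ndarray"
    assert d['w'].shape == (X.shape[0], 1), f"Wrong shape for d['w']. {d['w'].shape} != {(X.shape[0], 1)}"
    assert np.allclose(d['w'], expected_output['w']), f"Wrong values for d['w']. {d['w']} != {expected_output['w']}"
    assert np.allclose(d['b'], expected_output['b']), f"Wrong values for d['b']. {d['b']} != {expected_output['b']}"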
At this point I am completely lost and have no idea what to do.
The problem comes from this line:
params, grads, costs = optimize(w, b, X_train, Y_train)
You still need to pass the hyperparameters through to optimize. Omitting the optional arguments makes optimize fall back to its defaults (num_iterations=100, learning_rate=0.009), which are not the values handed to model, so the test ends up checking weights trained with the wrong settings. The line above should be:
params, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost=print_cost)
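With the defaults, optimize runs only 100 iterations at a learning rate of 0.009, while the test calls model expecting its own settings to be used. A quick way to see how far apart the two runs end up, using the functions above on toy numbers of my own (not the course data):

import numpy as np

X = np.array([[1.0, 2.0, -1.0],
              [3.0, 0.5, -3.2]])   # made-up features: 2 features, 3 examples
Y = np.array([[1.0, 0.0, 1.0]])    # made-up labels
w0 = np.zeros((X.shape[0], 1))
b0 = np.zeros((1, 1))

# What model currently does: fall back to optimize's defaults (100 iterations, lr 0.009)
p_default, _, _ = optimize(w0, b0, X, Y)
# What it should do: forward the caller's hyperparameters (e.g. 2000 iterations, lr 0.5)
p_forwarded, _, _ = optimize(w0, b0, X, Y, num_iterations=2000, learning_rate=0.5)

print(np.allclose(p_default["w"], p_forwarded["w"]))  # False: two very different weight vectors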