I am enrolled in the Machine Learning Specialization course by Andrew Ng on Coursera, where I encountered this function implementing the gradient descent algorithm:
    import copy
    import math

    def gradient_descent(x, y, w_in, b_in, alpha, num_iters, cost_function, gradient_function):
        w = copy.deepcopy(w_in)  # avoid modifying global w_in
        # Arrays to store cost J and the parameters at each iteration, primarily for graphing later
        J_history = []
        p_history = []
        b = b_in
        for i in range(num_iters):
            # Calculate the gradient using gradient_function
            dj_dw, dj_db = gradient_function(x, y, w, b)
            # Update parameters using equation (3) above
            b = b - alpha * dj_db
            w = w - alpha * dj_dw
            # Save cost J at each iteration
            if i < 100000:  # prevent resource exhaustion
                J_history.append(cost_function(x, y, w, b))
                p_history.append([w, b])
            # Print cost at 10 evenly spaced intervals, or at every iteration if num_iters < 10
            if i % math.ceil(num_iters/10) == 0:
                print(f"Iteration {i:4}: Cost {J_history[-1]:0.2e} ",
                      f"dj_dw: {dj_dw: 0.3e}, dj_db: {dj_db: 0.3e} ",
                      f"w: {w: 0.3e}, b:{b: 0.5e}")
        return w, b, J_history, p_history  # return w, b and the J, p histories for graphing
Could anyone please explain the second if statement within the for-loop?
I don't understand the actual purpose of that conditional statement. I do understand that it prints something to the console, but what does the following condition signify in this case?
if i% math.ceil(num_iters/10) == 0:
If you deconstruct `i % math.ceil(num_iters/10) == 0`:
- `num_iters/10` is the number of iterations divided by 10.
- `math.ceil` rounds that result up, so that it is an integer.
- `%` is the modulo operator, so it returns the remainder of the division.
- `== 0` checks whether that remainder is 0, which means `i` is a multiple of `math.ceil(num_iters/10)`.
Overall, this expression is `True` when `i` is at a decile of `num_iters`, so the cost is printed about ten times over the whole run (or at every iteration if `num_iters` is less than 10).
For example, if num_iters = 200, this will print ten times, when i = 0, 20, 40, ..., 160, 180 (the loop runs over range(num_iters), so i starts at 0 and never reaches 200).
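To see which iterations actually trigger the print, here is a minimal sketch that isolates the condition from the training loop. The helper name `printing_iterations` is hypothetical (it is not part of the course code); it only collects the values of `i` in `range(num_iters)` for which the condition is true.

```python
import math

def printing_iterations(num_iters):
    """Return the loop indices i at which the print condition is true.

    Hypothetical helper for illustration only, not part of the course code.
    """
    interval = math.ceil(num_iters / 10)  # print interval, rounded up to an integer
    return [i for i in range(num_iters) if i % interval == 0]

# 200 iterations: interval is 20, so ten prints, starting at i = 0
print(printing_iterations(200))  # [0, 20, 40, 60, 80, 100, 120, 140, 160, 180]

# 25 iterations: 25/10 = 2.5 is rounded up to 3, so only nine prints
print(printing_iterations(25))   # [0, 3, 6, 9, 12, 15, 18, 21, 24]

# Fewer than 10 iterations: interval is 1, so every iteration prints
print(printing_iterations(5))    # [0, 1, 2, 3, 4]
```

The second case shows why "ten times" is only approximate: when `num_iters` is not a multiple of 10, rounding up makes the interval slightly larger, so there can be fewer than ten prints.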