I want to implement convolution myself using Python and NumPy. This is my code:
import numpy as np
import cv2 as cv

def add_padding_to_image(img, kernel):
    old_image_height, old_image_width, channels = img.shape
    padding_width = kernel.shape[0] // 2
    new_image_width = img.shape[1] + padding_width * 2
    new_image_height = img.shape[0] + padding_width * 2
    color = (0, 0, 0)
    result = np.full((new_image_height, new_image_width, channels), color, dtype=np.uint8)
    # compute center offset
    x_center = (new_image_width - old_image_width) // 2
    y_center = (new_image_height - old_image_height) // 2
    # copy img into the center of the result image
    result[y_center:y_center + old_image_height,
           x_center:x_center + old_image_width] = img
    return result
def convolve(img, kernel):
    height, width, color = img.shape
    kernel_h, kernel_w = kernel.shape
    img = img.copy()
    # loop through the image, for each pixel on the image
    for y in range(kernel_h // 2, height - kernel_h // 2):
        y_start = y - kernel_h // 2
        y_end = y + kernel_h // 2
        for x in range(kernel_w // 2, width - kernel_w // 2):
            # for each pixel on the image
            x_start = x - kernel_w // 2
            x_end = x + kernel_w // 2
            # get ready to loop over the pixel neighborhood * kernel
            kx = 0
            ky = 0
            b_sum = 0
            g_sum = 0
            r_sum = 0
            # loop through the neighbors of the image pixel
            for i in range(y_start, y_end + 1):
                for j in range(x_start, x_end + 1):
                    # print("i", i, "j", j, "kx", kx, "ky", ky)
                    color = img[i][j]
                    # add to the sum
                    b_sum = b_sum + color[0] * kernel[ky][kx]
                    g_sum = g_sum + color[1] * kernel[ky][kx]
                    r_sum = r_sum + color[2] * kernel[ky][kx]
                    # move to the next kernel column
                    kx = (kx + 1) % kernel_w
                # move to the next kernel row
                ky = (ky + 1) % kernel_h
            img[y][x] = [b_sum, g_sum, r_sum]
    return img
# read image
img = cv.imread('cat.jpg')

outline = np.array([
    [-1, -1, -1],
    [-1, 8, -1],
    [-1, -1, -1]
])

img_padding = add_padding_to_image(img, outline)
filter_image = convolve(img_padding, outline)

# show the images
cv.imshow("Image", img)
cv.imshow("Filter Image", filter_image)
cv.waitKey(0)
cv.destroyAllWindows()
add_padding_to_image adds a border around the image, and convolve does the actual convolution.
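For example, with the 3x3 outline kernel from the code above, add_padding_to_image should grow the image by one black pixel on every side (a quick check with a dummy image, just for illustration):

dummy = np.zeros((100, 200, 3), dtype=np.uint8)
padded = add_padding_to_image(dummy, outline)
print(padded.shape)  # (102, 202, 3)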
I am expecting it to give the same result as cv.filter2D. For example, using:

outline = np.array([
    [-1, -1, -1],
    [-1, 8, -1],
    [-1, -1, -1]
])
dst = cv.filter2D(img, -1, outline)
The cat image will be converted from:
But my code gives me this:
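One way to compare the two results numerically (only on interior pixels, because cv.filter2D's default border handling is reflection rather than the zero padding I add) would be something like:

dst = cv.filter2D(img, -1, outline)
# skip the padded border and the outermost image row/column, where the two
# border strategies legitimately differ
print(np.array_equal(filter_image[2:-2, 2:-2], dst[1:-1, 1:-1]))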
I have been stuck on this for days and can't figure out what I am missing. Can anyone take a look?
Thanks!
Firstly, on the line img[y][x] = [b_sum, g_sum, r_sum] you overwrite pixels that later iterations of the loop will read again, so the filter starts consuming its own output. To fix that, change img = img.copy() at the top of convolve() to returned = img.copy(), write the result pixels into returned, and return returned instead of return img at the bottom.
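In other words, only these lines change (sketch, everything else stays as it is):

returned = img.copy()                      # was: img = img.copy()
returned[y][x] = [b_sum, g_sum, r_sum]     # was: img[y][x] = [b_sum, g_sum, r_sum]
return returned                            # was: return img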
Secondly (and lastly), clamp your pixel values to the range 0-255 so they do not overflow uint8 when you store them. Like this:
b_sum = 255 if b_sum>255 else b_sum
g_sum = 255 if g_sum>255 else g_sum
r_sum = 255 if r_sum>255 else r_sum
b_sum = 0 if b_sum<0 else b_sum
g_sum = 0 if g_sum<0 else g_sum
r_sum = 0 if r_sum<0 else r_sum
returned[y][x] = [b_sum, g_sum, r_sum]
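Putting both changes together, the whole function would look roughly like this (results go into returned while the neighborhood is always read from the untouched img, and the sums are clamped before being stored back into the uint8 image):

def convolve(img, kernel):
    height, width, color = img.shape
    kernel_h, kernel_w = kernel.shape
    returned = img.copy()   # output buffer; img itself is never modified
    for y in range(kernel_h // 2, height - kernel_h // 2):
        y_start = y - kernel_h // 2
        y_end = y + kernel_h // 2
        for x in range(kernel_w // 2, width - kernel_w // 2):
            x_start = x - kernel_w // 2
            x_end = x + kernel_w // 2
            kx = 0
            ky = 0
            b_sum = 0
            g_sum = 0
            r_sum = 0
            for i in range(y_start, y_end + 1):
                for j in range(x_start, x_end + 1):
                    color = img[i][j]   # read from the original pixels
                    b_sum = b_sum + color[0] * kernel[ky][kx]
                    g_sum = g_sum + color[1] * kernel[ky][kx]
                    r_sum = r_sum + color[2] * kernel[ky][kx]
                    kx = (kx + 1) % kernel_w
                ky = (ky + 1) % kernel_h
            # clamp to the uint8 range before storing
            b_sum = 255 if b_sum > 255 else b_sum
            g_sum = 255 if g_sum > 255 else g_sum
            r_sum = 255 if r_sum > 255 else r_sum
            b_sum = 0 if b_sum < 0 else b_sum
            g_sum = 0 if g_sum < 0 else g_sum
            r_sum = 0 if r_sum < 0 else r_sum
            returned[y][x] = [b_sum, g_sum, r_sum]
    return returned

If you prefer, np.clip([b_sum, g_sum, r_sum], 0, 255) does the same clamping in a single line.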