I am trying to implement a Sobel operator manually.
The horizontal and vertical components look fine on their own, but the combined image has a lot of noise.
I also notice that something like (imgv**2)**0.5 introduces a lot of noise, even though ideally I should get approximately the same image back.
Does anyone know what's going on here? Am I supposed to combine the images a different way?
Here is my code in Python:
import cv2
import numpy as np
sobelX = np.array([[1,0,-1],[2,0,-2],[1,0,-1]])
sobelY = sobelX.T
imgoriginal = cv2.imread("building.bmp")
imgv = cv2.filter2D(imgoriginal, -1, sobelY)
imgh = cv2.filter2D(imgoriginal, -1, sobelX)
imgboth = (imgv**2 + imgh**2)**0.5
This is the output:
Update: A better method.
#!/usr/bin/python3
# 2017.12.22 21:48:22 CST
import cv2
import numpy as np
## parameters
sobelX = np.array([[1,0,-1],[2,0,-2],[1,0,-1]])
sobelY = sobelX.T
ddepth = cv2.CV_16S
## read the image (same file name as in the original answer below)
img = cv2.imread("cat.png")
## calc gx and gy
#img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = cv2.GaussianBlur(img, (3,3), 0)
gx = cv2.filter2D(img, ddepth, sobelX)
gy = cv2.filter2D(img, ddepth, sobelY)
## combine |gx| and |gy| into the gradient image
gxabs = cv2.convertScaleAbs(gx)
gyabs = cv2.convertScaleAbs(gy)
grad = cv2.addWeighted(gxabs, 0.5, gyabs, 0.5, 0)
cv2.imwrite("result.png", grad)
Original answer:
Yes, this has tripped me up too when doing math on OpenCV images in NumPy. The image data type is np.uint8 by default, so arithmetic on it can overflow and wrap around unless you convert to a higher-precision type first.
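To see the wraparound in isolation, here is a minimal sketch with plain NumPy. The three pixel values are made up for illustration; they are not taken from the question's image.

```python
import numpy as np

# Three sample pixel values stored as uint8, the dtype cv2.imread returns.
pixels = np.array([10, 16, 200], dtype=np.uint8)

# Squaring stays in uint8, so the results wrap modulo 256.
wrapped = pixels ** 2
print(wrapped)    # [100   0  64]

# Converting to float32 first keeps the true squares.
exact = pixels.astype(np.float32) ** 2
print(exact)      # [  100.   256. 40000.]

# Only the float version survives the square / square-root round trip.
roundtrip = np.sqrt(exact)
print(roundtrip)  # [ 10.  16. 200.]
```

This is why (imgv**2)**0.5 looks like noise: any pixel whose square exceeds 255 has already wrapped before the square root is taken.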
Try this:
import cv2
import numpy as np
sobelX = np.array([[1,0,-1],[2,0,-2],[1,0,-1]])
sobelY = sobelX.T
img = cv2.imread("cat.png")
## Change the color space
#img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
imgv = cv2.filter2D(img, -1, sobelY)
imgh = cv2.filter2D(img, -1, sobelX)
## Change the precision first, then do the math
imghv = (np.float32(imgv)**2 + np.float32(imgh)**2)**0.5
## Normalize and convert back to uint8
## Use cv2.convertScaleAbs() to convert the values into the range [0, 255]
imghv = imghv/imghv.max()*255
imghv = cv2.convertScaleAbs(imghv)
## Display
res = np.hstack((imgh, imgv, imghv))
cv2.imshow("Sobel", res)
cv2.waitKey()
cv2.destroyAllWindows()