I have a 2D numpy array containing greyscale pixel values from 0 to 255. What I want to do is create a Gaussian filter from scratch. I have already written a function to generate a normalized Gaussian kernel:
import math
import numpy as np

def gaussianKernel(size, sigma):
    kernel = np.fromfunction(lambda x, y: (1/(2*math.pi*sigma**2)) * math.e ** ((-1*((x-(size-1)/2)**2+(y-(size-1)/2)**2))/(2*sigma**2)), (size, size))
    return kernel / np.sum(kernel)
which works fine:
>>> vision.gaussianKernel(5, 1.5)
array([[ 0.01441882, 0.02808402, 0.0350727 , 0.02808402, 0.01441882],
       [ 0.02808402, 0.05470021, 0.06831229, 0.05470021, 0.02808402],
       [ 0.0350727 , 0.06831229, 0.08531173, 0.06831229, 0.0350727 ],
       [ 0.02808402, 0.05470021, 0.06831229, 0.05470021, 0.02808402],
       [ 0.01441882, 0.02808402, 0.0350727 , 0.02808402, 0.01441882]])
So then I created a basic convolution function that applies this kernel to each pixel and produces a Gaussian blur:
def gaussianBlurOld(img, kSize, kSigma):
    kernel = gaussianKernel(kSize, kSigma)
    d = int((kSize-1)/2)
    gaussian = np.zeros((img.shape[0]-2*d, img.shape[1]-2*d))
    for y in range(d, img.shape[0]-d):
        for x in range(d, img.shape[1]-d):
            gaussian[y-d][x-d] = np.sum(np.multiply(img[y-d:y+d+1, x-d:x+d+1], kernel))
    return gaussian
This works fine and blurs an image. However, as this code will eventually be running on a Raspberry Pi, I need it to be much faster. Thanks to this answer to a question I asked yesterday on how to speed up a Sobel edge detector, I tried to apply the same logic to the Gaussian filter. However, as the function accepts a variable size parameter for the kernel, things are slightly more complicated than with the Sobel kernel, which has a fixed size of 3x3.
If I understand the explanation correctly, I first need to separate the kernel into x and y components, which can be done by just using the top row and left column of the original kernel (obviously they are the same, but I decided to keep them separate as I have the 2D kernel already calculated). Below is the matrix separated:
From these row and column vectors, I need to go through each value and multiply a 'window' of the array by it element-wise. After each value, I shift the reduced-size window one column to the right along the array. To show more clearly what I think I need to do, these are the 3 different 'windows' I am talking about for a small image with a kernel size of 3x3:
          _______3_______
     _____|_2_______     |
_____|_1__|____|    |    |
|    |    |    |    |    |
|123,|213,|124,|114,|175|
|235,|161,|127,|215,|186|
|128,|215,|111,|141,|221|
|224,|171,|193,|127,|117|
|146,|245,|129,|213,|221|
|152,|131,|150,|112,|171|
So for each 'window', you multiply it by the kernel value at that window's index and add the result to the total.
Then take the image that has had the x component of the Gaussian kernel applied to it and do the same for the y component.
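The horizontal pass described above can be sketched on the example image from the diagram. This is only an illustration: `kernel_1d` is a made-up size-3 kernel (not a real Gaussian), and the names `windowed` and `looped` are hypothetical. The sketch checks that accumulating the three shifted windows gives the same result as an explicit per-pixel loop.

```python
import numpy as np

# The 6x5 example image from the diagram above.
img = np.array([
    [123, 213, 124, 114, 175],
    [235, 161, 127, 215, 186],
    [128, 215, 111, 141, 221],
    [224, 171, 193, 127, 117],
    [146, 245, 129, 213, 221],
    [152, 131, 150, 112, 171],
], dtype=float)

# Hypothetical 1D kernel of size 3 that already sums to 1 (assumption for illustration).
kernel_1d = np.array([0.25, 0.5, 0.25])
k = len(kernel_1d)
out_width = img.shape[1] - k + 1  # 3 valid output columns

# Shifted-window pass: one whole-array multiply-add per kernel value.
windowed = np.zeros((img.shape[0], out_width))
for i, v in enumerate(kernel_1d):
    windowed += v * img[:, i:i + out_width]

# Reference: explicit per-pixel loop over the same row neighbourhoods.
looped = np.zeros_like(windowed)
for y in range(img.shape[0]):
    for x in range(out_width):
        looped[y, x] = np.sum(img[y, x:x + k] * kernel_1d)

print(np.allclose(windowed, looped))  # True
```

The windowed version runs only `k` iterations of Python-level code regardless of image size, which is where the speed-up over the nested loops comes from.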
These are the steps I think I can use to calculate the Gaussian blur much faster than with the nested for-loops above, and here is the code that I wrote to try to do it:
def gaussianBlur(img, kSize, kSigma):
    kernel = gaussianKernel(kSize, kSigma)
    gausX = np.zeros((img.shape[0], img.shape[1] - kSize + 1))
    for i, v in enumerate(kernel[0]):
        gausX += v * img[:, i : img.shape[1] - kSize + i + 1]
    gausY = np.zeros((gausX.shape[0] - kSize + 1, gausX.shape[1]))
    for i, v in enumerate(kernel[:,0]):
        gausY += v * gausX[i : img.shape[0] - kSize + i + 1]
    return gausY
My problem is that this function produces the right 'blurring effect', but the output values are all floats between 0 and 3 for some reason. Luckily, for some other reason, matplotlib can still display the output fine, so I can check that it has blurred the image correctly.
The question is simply: why are the output pixel values between 0 and 3?
I have debugged for hours but cannot spot the reason. I am pretty sure that there is just a little scaling detail somewhere, but I just can't find it. Any help would be much appreciated!
For anyone interested, the problem was that the function gaussianKernel returned the 2D kernel normalised for use as a 2D kernel. This meant that when I split it into its row and column components by taking the top row and left column, these components were not normalised.
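The size of the scaling error can be checked directly. The sketch below rebuilds the normalised 2D kernel (the helper name `gaussian_kernel_2d` is hypothetical, and it drops the constant factor from the original formula because the normalisation cancels it) and multiplies the sums of the top row and left column, which is the factor a constant image would be scaled by after both passes.

```python
import numpy as np

def gaussian_kernel_2d(size, sigma):
    # Same shape as gaussianKernel in the question; the 1/(2*pi*sigma^2)
    # prefactor is omitted because dividing by the sum cancels it anyway.
    c = (size - 1) / 2
    kernel = np.fromfunction(
        lambda x, y: np.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2)),
        (size, size),
    )
    return kernel / np.sum(kernel)

kernel = gaussian_kernel_2d(5, 1.5)
row_sum = np.sum(kernel[0])      # sum of the top row, used for the x pass
col_sum = np.sum(kernel[:, 0])   # sum of the left column, used for the y pass

# A constant image is scaled by row_sum in the x pass and col_sum in the y pass.
scale = row_sum * col_sum
print(row_sum, scale, 255 * scale)
```

For a 5x5 kernel with sigma 1.5, the top row sums to roughly 0.12, so the combined factor is roughly 0.0144 and a white pixel (255) comes out at about 3.7, which matches the observed output range.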
To solve this, I just added a parameter to the gaussianKernel function to select a 2-dimensional or 1-dimensional kernel (both normalised correctly):
def gaussianKernel(size, sigma, twoDimensional=True):
    if twoDimensional:
        kernel = np.fromfunction(lambda x, y: (1/(2*math.pi*sigma**2)) * math.e ** ((-1*((x-(size-1)/2)**2+(y-(size-1)/2)**2))/(2*sigma**2)), (size, size))
    else:
        kernel = np.fromfunction(lambda x: math.e ** ((-1*(x-(size-1)/2)**2) / (2*sigma**2)), (size,))
    return kernel / np.sum(kernel)
So now I can get just the 1D kernel with gaussianKernel(size, sigma, False), and have it be normalised correctly. This means I can finally get the right blurring effect without scaled pixel values.
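Putting the fix together, a minimal sketch of the two-pass blur with a correctly normalised 1D kernel might look like the following. The function names (`gaussian_kernel_1d`, `separable_blur`, `brute_force_blur`) and the random test image are assumptions for illustration, not the code from the post. The brute-force version uses the outer product of the 1D kernel with itself as the 2D kernel and serves only as a reference to compare against.

```python
import numpy as np

def gaussian_kernel_1d(size, sigma):
    # 1D Gaussian sampled around the kernel centre, normalised to sum to 1.
    x = np.arange(size) - (size - 1) / 2
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    return kernel / np.sum(kernel)

def separable_blur(img, size, sigma):
    k = gaussian_kernel_1d(size, sigma)
    out_h = img.shape[0] - size + 1
    out_w = img.shape[1] - size + 1
    # Horizontal pass: accumulate column-shifted windows.
    gaus_x = np.zeros((img.shape[0], out_w))
    for i, v in enumerate(k):
        gaus_x += v * img[:, i:i + out_w]
    # Vertical pass: accumulate row-shifted windows of the x-blurred image.
    gaus_y = np.zeros((out_h, out_w))
    for i, v in enumerate(k):
        gaus_y += v * gaus_x[i:i + out_h]
    return gaus_y

def brute_force_blur(img, size, sigma):
    # Reference implementation: full 2D kernel applied per pixel.
    k = gaussian_kernel_1d(size, sigma)
    kernel_2d = np.outer(k, k)  # separable 2D kernel, sums to 1
    out = np.zeros((img.shape[0] - size + 1, img.shape[1] - size + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + size, x:x + size] * kernel_2d)
    return out

# Random greyscale test image (assumption: any 2D array of 0..255 values works).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(20, 30)).astype(float)

fast = separable_blur(img, 5, 1.5)
slow = brute_force_blur(img, 5, 1.5)
print(np.allclose(fast, slow))  # True
```

Because each 1D kernel sums to 1, every output pixel is a weighted average of input pixels, so the blurred values stay within the 0 to 255 range of the input.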