Tags: tensorflow, keras, canny-operator

Conditional value on tensor relative to element neighbors


I am implementing the Canny algorithm in TensorFlow (I need the detected borders as an evaluation metric, but that is off topic). One of the steps is non-maximum suppression, which zeroes the center element of a 3x3 region unless it is greater than two specific neighbors. More details here.

How can I implement this operation in TensorFlow?

I am actually using Keras, but a TensorFlow solution will work as well. For reference, my code so far looks like this:

import numpy as np
from keras import backend as K

def canny(img):
    '''Canny border detection. The input should be a grayscale image.'''
    gauss_kernel = np.array([[2,  4,  5,  4, 2],
                             [4,  9, 12,  9, 4],
                             [5, 12, 15, 12, 5],
                             [4,  9, 12,  9, 4],
                             [2,  4,  5,  4, 2]]).reshape(5, 5, 1, 1)
    gauss_kernel = K.variable(1./159 * gauss_kernel)

    Gx = K.variable(np.array([[-1., 0., 1.],
                              [-2., 0., 2.],
                              [-1., 0., 1.]]).reshape(3, 3, 1, 1))

    Gy = K.variable(np.array([[-1., -2., -1.],
                              [ 0.,  0.,  0.],
                              [ 1.,  2.,  1.]]).reshape(3, 3, 1, 1))
    # Smooth image
    smoothed = K.conv2d(img, gauss_kernel, padding='same')
    # Derivative in x
    Dx = K.conv2d(smoothed, Gx, padding='same')
    # Derivative in y
    Dy = K.conv2d(smoothed, Gy, padding='same')
    # Take gradient strength
    G = K.sqrt(K.square(Dx) + K.square(Dy))

    # TODO: Non-maximum Suppression & Hysteresis Thresholding   

    return G

Solution

  • You could use convolutional filters to segregate the two target pixels and make them "concentric" with the central pixel.

    To compare against two target pixels, for instance, we could use a filter of shape (3, 3, 1, 2): one input channel and two output channels. Each output channel returns one of the target pixels.

    The filter has a 1 at each target pixel and zeros everywhere else:

    #taking two diagonal pixels
    filter = np.zeros((3,3,1,2))
    filter[0,0,0,0] = 1 #first pixel is top/left, passed to the first channel
    filter[2,2,0,1] = 1 #second pixel is bottom/right, passed to the second channel
    #which one is really top or bottom, left or right, depends on your preprocessing,
    #but the choice should be consistent with the rest of your model
    
    filter = K.variable(filter)
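
To see why a one-hot filter "moves" a neighbor onto the central position, here is a minimal NumPy sketch. It assumes the TensorFlow backend, where conv2d is a cross-correlation (the kernel is not flipped) and padding='same' pads with zeros. The helper name `correlate_same` is made up for this illustration and is not part of Keras.

```python
import numpy as np

def correlate_same(img, kernel):
    """Cross-correlate a 2-D image with a small odd-sized kernel, zero 'same' padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))  # zeros around the border
    out = np.zeros_like(img, dtype=float)
    for i in range(kh):
        for j in range(kw):
            # each kernel entry contributes a shifted copy of the image
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

img = np.arange(1, 10, dtype=float).reshape(3, 3)

top_left = np.zeros((3, 3))
top_left[0, 0] = 1  # same position as filter[0,0,0,0] above

shifted = correlate_same(img, top_left)
# shifted[y, x] now holds img[y-1, x-1], i.e. the top/left neighbor of each pixel
```

Positions whose neighbor falls outside the image receive 0 from the padding, so border pixels are compared against zero.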
    

    If you're taking top and bottom, or left and right, you can use smaller filters. They don't need to be 3x3 (although 3x3 works too); 1x3 or 3x1 is enough:

    filter1 = np.zeros((1,3,1,2)) #horizontal filter
    filter2 = np.zeros((3,1,1,2)) #vertical filter
    
    filter1[0,0,0,0] = 1    #left pixel -  if filter is 3x3: [1,0,0,0]
    filter1[0,2,0,1] = 1    #right pixel - if filter is 3x3: [1,2,0,1]
    filter1 = K.variable(filter1)
    
    filter2[0,0,0,0] = 1    #top pixel -  if filter is 3x3: [0,1,0,0]
    filter2[2,0,0,1] = 1    #bottom pixel - if filter is 3x3: [2,1,0,1]
    filter2 = K.variable(filter2)
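
If you need several neighbor pairs (Canny picks the pair according to the gradient direction), the filters above can be generated from a small table instead of being written by hand. This is only a sketch: `NEIGHBOR_OFFSETS` and `neighbor_pair_filter` are hypothetical names, and the direction labels are my own convention, not something Keras defines.

```python
import numpy as np

# Hypothetical lookup: (row, col) positions of the two neighbors inside a 3x3 window
NEIGHBOR_OFFSETS = {
    "horizontal": ((1, 0), (1, 2)),     # left, right
    "vertical": ((0, 1), (2, 1)),       # top, bottom
    "diagonal": ((0, 0), (2, 2)),       # top/left, bottom/right
    "anti_diagonal": ((0, 2), (2, 0)),  # top/right, bottom/left
}

def neighbor_pair_filter(direction):
    """Return a (3, 3, 1, 2) one-hot kernel; each output channel picks one neighbor."""
    kernel = np.zeros((3, 3, 1, 2), dtype="float32")
    for channel, (row, col) in enumerate(NEIGHBOR_OFFSETS[direction]):
        kernel[row, col, 0, channel] = 1.0
    return kernel

diag = neighbor_pair_filter("diagonal")  # matches the diagonal filter built by hand
```

Each returned array would still need to be wrapped with K.variable before being passed to K.conv2d, as in the snippets above.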
    

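    Then you apply these as convolutions. With padding='same' the output keeps the spatial size of the input, so you get one channel holding the first neighbor and another channel holding the second neighbor, each aligned with the central pixel. You can then compare them as if they were all in the same place, just in different channels: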

    targetPixels = K.conv2d(originalImages, kernel=filter, padding='same')
    
    #two channels telling if the center pixel is greater than the pixel in each channel
    #(the single image channel broadcasts against the two neighbor channels)
    isGreater = K.greater(originalImages,targetPixels)
    
    #merging the two channels, considering they're 0 for false and 1 for true
    isGreater = K.cast(isGreater,K.floatx())
    isGreater = isGreater[:,:,:,:1] * isGreater[:,:,:,1:]
    
    #now, the center pixel will remain if isGreater = 1 at that position:
    result = originalImages * isGreater
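
For sanity-checking the Keras result on a small input, the same compare-and-mask logic can be written directly in NumPy. This sketch covers only the left/right pair and assumes zero padding at the borders; `suppress_horizontal` is a hypothetical helper name, not a library function.

```python
import numpy as np

def suppress_horizontal(img):
    """Keep a pixel only if it is strictly greater than its left and right neighbors."""
    padded = np.pad(img, ((0, 0), (1, 1)))  # zero padding, like padding='same'
    left = padded[:, :-2]                   # left neighbor aligned with each pixel
    right = padded[:, 2:]                   # right neighbor aligned with each pixel
    keep = (img > left) & (img > right)     # same role as the product of the two channels
    return img * keep

row = np.array([[1., 3., 2., 5., 5.]])
result = suppress_horizontal(row)
# only the 3 survives: [[0., 3., 0., 0., 0.]]
```

Note that the comparison is strict, just like K.greater, so the two equal values (5, 5) suppress each other. If plateaus should be kept, K.greater_equal would be the comparison to use instead.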