
Explanation about hough_line_peaks after using hough_line


Could anyone please explain why, after calling h, theta, d = hough_line(image), which detects the angles in an image, we still have to use the loop for _, a, d in zip(*hough_line_peaks(h, theta, d)): angle.append(a) to add the angles to the list angle? Since hough_line(image) already stores the angles in the theta variable, why can't we use theta directly to retrieve the angles and add them to a list?

Here's a snippet of the code to detect angles of lines in an image:

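import numpy as np
from skimage.io import imread
from skimage.transform import hough_line, hough_line_peaks

image = imread(file_name)
image = np.mean(image, axis=2)  # average the color channels

h, theta, d = hough_line(image)

angle = []

for _, a, d in zip(*hough_line_peaks(h, theta, d)):
    angle.append(a)

angle = [a * 180 / np.pi for a in angle]  # radians to degrees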

Solution

  • So what's happening here is the following:

    • hough_line computes the Hough transform of the whole image.
      • It returns the accumulator h, the array theta of every angle that was tested (in radians), and the array d of every distance from the origin that was tested. By default theta holds 180 evenly spaced angles covering -90° up to (but not including) 90°, whatever the image contains. So theta is the grid of candidate angles, not a list of detected lines. Each cell of h counts the votes for one (distance, angle) pair, and many of those cells, let's say 1000, will have collected at least some votes.
    • hough_line_peaks selects only the most prominent peaks in that accumulator. With its default parameters it keeps:
      • those peaks whose distance from origin is at least 9 "resolution" steps away from another similar peak (min_distance=9)
      • those peaks whose theta coordinate is at least 10 angle "resolution" steps away from another similar peak (min_angle=10)
      • those whose accumulator count is larger than max_count/2 in the whole Hough space (the default threshold)
      • This picks, let's say, 200 out of the original 1000 candidates as more likely to actually be lines in the image.
      • This is effectively a filter on everything hough_line returned that picks the candidates most likely to be lines. You get back the accumulator values, the angles in radians, and the distances from origin of just those peaks.
    • Then the for loop goes through the peaks and appends each angle a to the list angle.
    • Then a list comprehension goes through angle once more and rebinds the name angle to a new list holding the same values converted to degrees.
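To make the difference between "all tested angles" and "peak angles" concrete, here is a minimal sketch of a Hough accumulator written with plain numpy. The function toy_hough and the synthetic image are hypothetical and only illustrate the idea; this is not scikit-image's implementation, and it finds a single strongest peak rather than applying the min_distance, min_angle and threshold rules described above.

```python
import numpy as np

def toy_hough(image, n_angles=180):
    """Toy Hough accumulator (illustrative only, not scikit-image's code)."""
    # Grid of candidate angles: fixed in advance, independent of the image.
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_angles, endpoint=False)
    ys, xs = np.nonzero(image)                    # coordinates of "on" pixels
    diag = int(np.ceil(np.hypot(*image.shape)))   # bound on |distance|
    dists = np.arange(-diag, diag + 1)
    acc = np.zeros((len(dists), n_angles), dtype=int)
    for j, t in enumerate(thetas):
        # Each pixel votes for the distance of the line through it at angle t.
        r = np.round(xs * np.cos(t) + ys * np.sin(t)).astype(int) + diag
        acc[:, j] = np.bincount(r, minlength=len(dists))
    return acc, thetas, dists

image = np.zeros((100, 100))
image[:, 50] = 1                                  # one vertical line at x = 50

acc, thetas, dists = toy_hough(image)
i, j = np.unravel_index(acc.argmax(), acc.shape)  # strongest peak only

print(len(thetas))                      # number of candidate angles tested
print(np.rad2deg(thetas[j]), dists[i])  # angle and distance of the peak
```

The image contains one line, yet thetas still has 180 entries. Only the position of the peak in the accumulator says which of those angles belongs to a line, which is the job hough_line_peaks does in the real code.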

    So as you can see, there's room for improvement. First, there is no need to copy the angles into a list one by one. Second, doing so actually hurts, because you give up the vectorized operations that numpy gives you. The angles returned from hough_line_peaks are a numpy ndarray, which you can just multiply directly.

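    import numpy as np
    import skimage.io
    import skimage.transform

    image = skimage.io.imread(file_name)
    image = np.mean(image, axis=2)  # average the color channels

    h, theta, d = skimage.transform.hough_line(image)
    bestH, bestTheta, bestD = skimage.transform.hough_line_peaks(h, theta, d)
    angle = bestTheta * (180/np.pi)  # whole array converted to degrees at once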
    

    This should give you the same results, and probably a bit faster too. The construct zip(*hough_line_peaks(h, theta, d)) deserves a closer look. The * is called the "unpacking" operator: given a tuple, list, or any other iterable (a, b, c), it passes the elements as separate arguments, so zip(*(a, b, c)) is the same as zip(a, b, c). Here hough_line_peaks returns three parallel arrays (accumulator values, angles, distances), and zip regroups them into one (value, angle, distance) triple per peak, which is what the for loop iterates over. That is a legitimate idiom, but it is more work than necessary when all you want is the angles, because the second returned array already holds exactly those. The underscore in for _, a, d in ... is a conventional name for a value you intend to ignore. And the final [a for a in ...] construct is called a list comprehension; it does the same job as a for loop that appends to a list, just in a more compact and often more readable form.
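A small sketch of what zip(*...) does, using plain Python lists. The three lists are made-up stand-ins for the arrays hough_line_peaks returns, not real output.

```python
# Made-up stand-ins for the three parallel arrays returned by hough_line_peaks.
accums = [120, 95, 80]
angles = [0.0, 0.5, -1.2]
dists = [50.0, 12.0, 33.0]
peaks = (accums, angles, dists)

# zip(*peaks) is zip(accums, angles, dists): one triple per peak.
triples = list(zip(*peaks))
print(triples[0])            # (120, 0.0, 50.0)

# The loop from the question therefore collects exactly the second sequence.
collected = [a for _, a, _ in triples]
print(collected == angles)   # True
```

So the loop ends up rebuilding, element by element, a sequence that was already available as the second item of the returned tuple.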

    EDIT

    Answer to the extra question in the comments. Let's say:

    >>> import numpy as np
    >>> a = np.array([[1, 2], [3, 4]])
    >>> a
    array([[1, 2],
           [3, 4]])
    
    >>> a.mean(axis=0)
    array([2., 3.])
    
    >>> a.mean(axis=1)
    array([1.5, 3.5])
    

    So the mean on axis 0 computes (1+3)/2=2 and (2+4)/2=3, so effectively axis=0 averages down each column of a. With axis=1 the average runs along each row, so that (1+2)/2=1.5 and (3+4)/2=3.5. The same goes for arrays with more dimensions; there are just more axes to choose from. For a 2D array the axes are vertical and horizontal; for a 3D array they are vertical, horizontal and depth. So axis=2 averages along "depth".
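One way to check which axis is which is to look at the shape of the result: the axis you average over disappears. A quick sketch with a made-up 3D array:

```python
import numpy as np

# A made-up 3D array: 2 rows, 3 columns, 4 values along "depth".
a = np.arange(24).reshape(2, 3, 4)

print(a.mean(axis=0).shape)  # (3, 4): rows averaged away
print(a.mean(axis=1).shape)  # (2, 4): columns averaged away
print(a.mean(axis=2).shape)  # (2, 3): depth averaged away
```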

    The reason this is necessary is that hough_line can only work with intensity data. It doesn't understand colors. But in a color image each pixel is actually 3 numbers (r, g, b) representing the red, green and blue channel intensities. The mix of these intensities eventually produces the color on the display. So the mean(axis=2) line is instructing numpy to take the 3D array representing an image and return a 2D array in which each pixel's three color intensities have been replaced by their average.

    So a color image:

    index      1           2     ....
         ___________________________________
      1  | [1, 2, 3]  [4, 5, 6] ...
      2  | [7, 8, 9] .          ...
      .  |     .       .
      .  |     .         .
      .  |     .           .
    

    becomes a grayscale image:

    index  1  2
        ______________________________
      1 |  2  5 ...
      2 |  8  . ...
      3 |  .   .
        |  .    .
        |  .     .
        |
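The two diagrams can be reproduced with a tiny array. This is a sketch: the helper to_intensity is a hypothetical name, and the fourth pixel [10, 11, 12] is made up to fill the part the diagram leaves out. The helper also passes 2D input through unchanged, because np.mean(image, axis=2) raises an error on an image that is already grayscale.

```python
import numpy as np

def to_intensity(image):
    """Average the color channels; pass 2D (already grayscale) arrays through."""
    image = np.asarray(image, dtype=float)
    if image.ndim == 2:
        return image
    return image.mean(axis=2)

# 2 x 2 pixels, 3 channels each; the last pixel is invented for illustration.
color = np.array([[[1, 2, 3], [4, 5, 6]],
                  [[7, 8, 9], [10, 11, 12]]])
gray = to_intensity(color)

print(color.shape, gray.shape)  # (2, 2, 3) (2, 2)
print(gray)
```

A plain mean weights the three channels equally. scikit-image also provides skimage.color.rgb2gray, which uses a weighted combination of the channels instead, if that matters for your images.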