Finding lines from blobs in raster file (python)


I have a binary raster file and want to find straight lines connecting several points in Python. Each line should be straight, but the lines may run at different angles. Here is an example of the binary raster in which I am searching for connecting lines:

[image: raster example]

I expected there to be a Python library that helps with this. So far I have only found approaches such as RANSAC or linear regression, but I do not want to fit a single line to all points, nor do I want to group the points manually before generating the lines (the risk of missing valid combinations seems too high).

I thought about building the centroids of the blobs and afterwards connecting suitable centroids to straight lines. At this point my two main problems are:

  1. How to divide the blobs and get their centroids
  2. How to iterate through the centroids efficiently without missing lines or finding the same line twice

For 1. I tried using OpenCV's findContours, but I received an error telling me that I cannot use multi-band images (I am working with a TIFF).


Solution

  • Here's an idea for a "brute force" type of approach. I haven't thought too much about how long it might take, or parallelising it for the moment.

    • find all the blobs and their centroids, using findContours(). You can get rid of your error messages about multi-channel images by making it greyscale using cvtColor(..., cv2.COLOR_BGR2GRAY)
    • perform a "Connected Components" analysis with cv2.connectedComponents() so that all the pixels of each blob get assigned the same label. I mean all pixels of top-left blob will be 1, all pixels of next blob will be 2, all pixels of next blob will be 3. Make this of type np.uint16 to accommodate up to a maximum of 32,767 blobs (to which we are going to add a further 32,768 in a minute)
    • Iterate over all blob centroids, and for each centroid:
    • Iterate over all angles 0..179, maybe using 5 degree steps during testing and 1 degree in production
    • Use scikit-image's skimage.draw.line() to generate the coordinates of a line at this angle through this centroid, extending to way outside the image bounds (discard the coordinates that fall outside the image) - these are "rays" radiating from the centroid
    • Add 32,768 to all pixels on this ray, working on a copy of the labelled image so that the next ray starts from clean labels
    • Use np.unique() to find the unique values in the image. Every value above 32,768 is a blob that this line at this angle from this centroid intersects. The value 32,768 itself is only background lying on the ray, so don't count it.
    • If the number of intersected blobs reaches a threshold, say 3 (meaning it intersects with 3 blobs), save it.

    Here's a more concrete example...

    Let's say you have 10 blobs. The background is labelled 0, all pixels of the first blob are labelled 1, all pixels of the second blob are labelled 2, and so on, so your labelled image has 11 unique values. Now, starting at the centroid of the first blob, generate lines at each angle from 0..179. For each line, get the coordinates of all points on the line and add 32,768 to those points. Now look at the unique values of your image. If the line intersected blob 5, there will be the value 32768+5 in your image. If it intersected blob 8, there will be the new value 32768+8. The starting blob itself shows up as 32768+1, and background pixels on the line show up as exactly 32768. Ignoring 32768 itself, the values above 32,768 are 32768+1, 32768+5 and 32768+8, so you'll know this line passes through blobs 5 and 8 in addition to the blob it started from.
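To make the counting concrete, here is a tiny hand-written label image (an illustrative toy, not the raster from the question) with the background as 0 and three blobs labelled 1 to 3. The "ray" is simply one image row, and RAY_FLAG is a name invented for this sketch.

```python
import numpy as np

RAY_FLAG = 32768  # hypothetical name for the value added along the ray

# Toy label image: 0 is background, 1..3 are blobs.
labels = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 2, 0, 0, 0],
    [1, 1, 0, 2, 0, 3, 3],
    [0, 0, 0, 0, 0, 3, 3],
], dtype=np.uint16)

before = np.unique(labels)        # background plus the three blobs

marked = labels.copy()            # leave the original labels untouched
marked[2, :] += RAY_FLAG          # a horizontal "ray" along row 2
after = np.unique(marked)

# Values strictly above RAY_FLAG are blobs on the ray; RAY_FLAG itself is
# background that happened to lie on the ray.
crossed = after[after > RAY_FLAG] - RAY_FLAG
print(len(before), len(after), crossed.tolist())
```

Note that simply subtracting the counts would give one more than the number of blobs crossed whenever the ray also passes over background, which is why the sketch filters on values above RAY_FLAG instead.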