Search code examples
pythonalgorithmmatrixgraphjulia

How to find connected components in a matrix using Julia


Assume I have the following matrix (defined here in Julia language):

    mat = [1 1 0 0 0 ; 1 1 0 0 0 ; 0 0 0 0 1 ; 0 0 0 1 1]

Considering as a "component" a group of neighbour elements that have value '1', how to identify that this matrix has 2 components and which vertices compose each one?

For the matrix mat above I would like to find the following result:

Component 1 is composed by the following elements of the matrix (row,column):

    (1,1)
    (1,2)
    (2,1)
    (2,2)

Component 2 is composed by the following elements:

    (3,5)
    (4,4)
    (4,5)

I can use Graph algorithms like this to identify components in square matrices. However such algorithms can not be used for non-square matrices like the one I present here.

Any idea will be much appreciated.

I am open if your suggestion involves the use of a Python library + PyCall. Although I would prefer to use a pure Julia solution.

Regards


Solution

  • Using Image.jl's label_components is indeed the easiest way to solve the core problem. However, your loop over 1:maximum(labels) may not be efficient: it's O(N*n), where N is the number of elements in labels and n the maximum, because you visit each element of labels n times.

    You'd be much better off just visiting each element of labels just twice: once to determine the maximum, and once to assign each non-zero element to its proper group:

    using Images
    
    function collect_groups(labels)
        groups = [Int[] for i = 1:maximum(labels)]
        for (i,l) in enumerate(labels)
            if l != 0
                push!(groups[l], i)
            end
        end
        groups
    end
    
    mat = [1 1 0 0 0 ; 1 1 0 0 0 ; 0 0 0 0 1 ; 0 0 0 1 1]
    
    labels = label_components(mat)
    groups = collect_groups(labels)
    

    Output on your test matrix:

    2-element Array{Array{Int64,1},1}:
     [1,2,5,6] 
     [16,19,20]
    

    Calling library functions like find can occasionally be useful, but it's also a habit from slower languages that's worth leaving behind. In julia, you can write your own loops and they will be fast; better yet, often the resulting algorithm is much easier to understand. collect(zip(ind2sub(size(mat),find( x -> x == value, mat))...)) does not exactly roll off the tongue.