Python: Cell arrays comparison using minus function


I have 3 cell arrays, and each cell array holds arrays of different sizes. How can I subtract them for each of the possible combinations of cell arrays?

For example:

import numpy as np
a=np.array([[np.array([[2,2,1,2]]),np.array([[1,3]])]])
b=np.array([[np.array([[4,2,1]])]])
c=np.array([[np.array([[1,2]]),np.array([[4,3]])]])

The possible combinations here are a-b, a-c and b-c.
Let's say a - b:

a=2,2,1,2 and 1,3

b=4,2,1


The desired result uses a shifting window, because the arrays have different sizes:

(2,2,1)-(4,2,1) ----> -2,0,0
(2,1,2)-(4,2,1) ----> -2,-1,1
(1,3)  -(4,2)   ----> -3,1,1
(1,3)  -(2,1)   ----> 4,-1,2

I would like to know how to create a shifting window in Python that lets me subtract my cell arrays.


Solution

  • I think this pair of functions does what you want. The first may need some tweaking to get the pairing of the differences right.

    import numpy as np
    
    def diffs(a,b):
        # collect sliding window differences
        # length of window determined by the shorter array
        # if a,b are not arrays, need to replace b[...]-a with
        # a list comprehension
        n,m=len(a),len(b)
        if n>m:
            # ensure a is the shorter
            b,a=a,b # switch
            n,m=len(a),len(b)
            # may need to correct for sign switch: the result is
            # always (window of the longer array) - (shorter array)
        result=[]
        for i in range(0,1+m-n):
            result.append(b[i:i+n]-a)
        return result
    
    def alldiffs(a,b):
        # collect all the differences for elements of a and b
        # a,b could be lists or arrays of arrays, or 2d arrays
        result=[]
        for aa in a:
            for bb in b:
                result.append(diffs(aa,bb))
        return result
    
    # define the 3 arrays
    # each is a list of 1d arrays
    
    a=[np.array([2,2,1,2]),np.array([1,3])]
    b=[np.array([4,2,1])]
    c=[np.array([1,2]),np.array([4,3])]
    
    # display the differences
    print(alldiffs(a,b))
    print(alldiffs(a,c))
    print(alldiffs(b,c))
    

    producing (with some pretty printing):

    1626:~/mypy$ python stack30678737.py 
    [[array([-2,  0,  0]), array([-2, -1,  1])], 
     [array([ 3, -1]), array([ 1, -2])]]
    
    [[array([1, 0]), array([ 1, -1]), array([0, 0])], 
     [array([-2, -1]), array([-2, -2]), array([-3, -1])], 
     [array([ 0, -1])], [array([3, 0])]]
    
    [[array([3, 0]), array([ 1, -1])], 
     [array([ 0, -1]), array([-2, -2])]]
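
The sign switch mentioned in the code comment shows up in this output: `[1,3]` against `[4,2,1]` comes out as `[3,-1]`, i.e. b minus a. If the intent is always "first minus second", something like the sketch below might be closer. The names `signed_diffs` and `all_signed_diffs` are made up for illustration, and the use of `itertools.combinations` to produce the a-b, a-c, b-c pairs is my assumption about how the pairing should be done.

```python
from itertools import combinations
import numpy as np

def signed_diffs(a, b):
    # hypothetical helper: always (first argument) - (second argument)
    # np.asarray lets plain lists work as well as arrays
    a, b = np.asarray(a), np.asarray(b)
    n, m = len(a), len(b)
    if n >= m:
        # a is the longer one: slide a window of length m along a
        return [a[i:i+m] - b for i in range(n - m + 1)]
    # b is the longer one: slide a window of length n along b
    return [a - b[i:i+n] for i in range(m - n + 1)]

def all_signed_diffs(x, y):
    # every array in x against every array in y
    return [signed_diffs(xx, yy) for xx in x for yy in y]

cells = {
    "a": [np.array([2, 2, 1, 2]), np.array([1, 3])],
    "b": [np.array([4, 2, 1])],
    "c": [np.array([1, 2]), np.array([4, 3])],
}

# combinations(..., 2) yields the pairs (a, b), (a, c), (b, c)
for (name1, x), (name2, y) in combinations(cells.items(), 2):
    print(f"{name1}-{name2}:", all_signed_diffs(x, y))
```

With this version `[1,3]` against `[4,2,1]` gives `[-3, 1]` and `[-1, 2]`, which is at least the sign used in the question.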
    

    Comparing my answer to yours, I wonder, are you padding your shorter arrays with 0 so the result is always 3 elements long?

    Changing a to a=[np.array([2,2,1,2]),np.array([0,1,3]),np.array([1,3,0])]

    produces:

    [[array([-2,  0,  0]), array([-2, -1,  1])], 
     [array([ 4,  1, -2])], [array([ 3, -1,  1])]]
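
If zero padding really is what you are after, the padded versions do not have to be typed by hand. A rough sketch with `np.pad`, assuming the shorter array should be padded to a fixed length in every possible position (`zero_padded_variants` is a name I made up, and this padding rule is only my guess at the specification):

```python
import numpy as np

def zero_padded_variants(x, length):
    # hypothetical helper: all ways of placing x inside a zero array
    # of the given length, starting with x pushed to the right
    extra = length - len(x)
    # np.pad defaults to mode='constant' with a fill value of 0
    return [np.pad(x, (extra - k, k)) for k in range(extra + 1)]

for p in zero_padded_variants(np.array([1, 3]), 3):
    print(p)
```

For `[1,3]` and length 3 this gives `[0 1 3]` and `[1 3 0]`, the same two arrays used in the modified `a` above.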
    

    I suppose you could do something fancier with this inner loop:

    for i in range(0,1+m-n):
        result.append(b[i:i+n]-a)
    

    But why? The first order of business is to get the problem specification clear; speed can wait. Besides the sliding-window code in image packages, there is a neat striding trick in np.lib.stride_tricks.as_strided, but I doubt that it will save time, especially in small examples like this.
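
For reference, this is roughly what the `as_strided` version of the inner loop could look like. It is only a sketch: `diffs_strided` is a made-up name, it assumes 1d arrays, and it keeps the same sign convention as `diffs` (window of the longer array minus the shorter array). It returns one 2d array instead of a list of 1d arrays.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def diffs_strided(a, b):
    # hypothetical variant of diffs; assumes a and b are 1d
    a, b = np.asarray(a), np.asarray(b)
    if len(a) > len(b):
        a, b = b, a  # ensure a is the shorter
    n, m = len(a), len(b)
    step = b.strides[0]
    # view of b with one row per window; no data is copied.
    # writeable=False guards against writing through the overlapping view
    windows = as_strided(b, shape=(m - n + 1, n),
                         strides=(step, step), writeable=False)
    # broadcasting subtracts a from every row
    return windows - a

print(diffs_strided(np.array([2, 2, 1, 2]), np.array([4, 2, 1])))
```

`as_strided` does no bounds checking, so a wrong shape or stride reads past the end of the buffer; that is one more reason to stay with the plain loop unless profiling says otherwise.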