Search code examples
pythonimagenumpymathleast-squares

Multivariate Optimization - scipy.optimize input parsing error


rgb image

I have the above rgb image saved as tux.jpg. Now I want to get the closest approximation to this image that is an outer product of two vectors I.e of the form A·BT.

Here is my code -

#load image to memory
import Image
im = Image.open('tux.jpg','r')
#save image to numpy array
import numpy as np
mat = np.asfarray(im.convert(mode='L')) # mat is a numpy array of dimension 354*300
msizex,msizey = mat.shape
x0 = np.sum(mat,axis=1)/msizex
y0 = np.sum(mat,axis=0)/msizey

X0 = np.concatenate((x0,y0)) # X0.shape is (654,)
# define error of outer product with respect to original image
def sumsquares(X):
    """ sum of squares - 
    calculates the difference between original and outer product

    input X is a 1D numpy array with the first 354 elements
    representing vector A and the rest 300 representing vector B.
    The error is obtained by subtracting the trial  $A\cdot B^T$ 
    from the original and then adding the square of all entries in 
    the matrix.
    """
    assert X.shape[0] == msizex+msizey
    x = X0[:msizex]
    y = X0[msizex:]
    return np.sum(
        (
        np.outer(x,y) - mat
        )**2
    )
#import minimize
from scipy.optimize import minimize
res = minimize(sumsquares, X0,
               method='nelder-mead',
               options={'disp':True}
)
xout = res.x[:msizex]
yout = res.x[msizex:]
mout = np.outer(xout,yout)
imout= Image.fromarray(mout,mode='L')
imout.show()

The result is the following image.

     Optimization terminated successfully.
     Current function value: 158667093349733.531250
     Iterations: 19
     Function evaluations: 12463

This doesn't look good enough to me. Is there any way to improve this? The noise in the output is not even of the same length as the structures in the original picture. My guess is that the algorithm isn't going through. How can I debug or improve this?

EDIT1: I created the image below with the code

size = 256
mat0 = np.zeros((size,size))
mat0[size/4:3*size/4,size/4:3*size/4] = 1000
#mat0[size/4:3*size/4,] = 1000
#mat0[:3*size/4,size/4:] = 1000

im0 = Image.fromarray(mat0)
im0.show() 

The two commented out lines result in two other images. Here are the result of my experiments -

  1. Square in the middle. Input - square middle
    Output - Same
  2. Band in the middle. Input - band middle
    Output - square middle
  3. White chunk to the North East Input - north east
    Output- north west

While this is much better than what I expected, cases 2 and 3 still end up being wrong. I hope that the arguments to the minimize function mean what I think they mean.


Solution

  • 1) The rendering problem of the first image seems to be an issue in the conversion from numpy array to image. I get the right rendering by running:

    imout = Image.fromarray(mout/np.max(mout)*255)
    

    (i.e. normalize the image to a maximum value of 255 and let it determine the mode automatically).

    In general, to check that Image.fromarray is working, it is useful to compare the output of imout.show() with

    import matplotlib.pyplot as plt
    plt.matshow(mout/np.max(mout)*255, cmap=plt.cm.gray)
    

    and you should get the same results. BTW, by doing that, I get all the 3 other cases correct.

    2) Secondly, the main problem with tux.png is that it is not possible to reconstruct an image with such a detailed structure with only an outer product of two 1-D vectors.

    (This tends to work for simple images such as the blocky ones shown above, but not for an image with few symmetries and many details).

    To prove the point:

    • There exist matrix factorization techniques that allow reconstructing a matrix as the product of two low-rank matrices M=AB, such as sklearn.decomposition.NMF.

    • In this case, setting the rank of A and B to 1 would be equivalent to your problem (with a different optimization technique).

    • Playing with the code below you can easily see that with n_components=1 (which is equivalent to an outer product of two 1-D vectors), the resulting reconstructed matrix looks very similar to the one outputted by your method, and that with bigger n_components, the better the reconstruction.

    For reproducibility:

    import matplotlib.pyplot as plt
    from sklearn.decomposition import NMF
    
    nmf = NMF(n_components=20)
    prj = nmf.fit_transform(mat)
    
    out = prj.dot(nmf.components_)
    out = np.asarray(out, dtype=float)
    
    imout = Image.fromarray(out)
    imout.show()
    

    For illustration, this is the NMF reconstruction with 1 component (this is exactly an outer product between two 1-D vectors):

    NMF reconstruction with 1 component, equivalent to outer product of 2 vectors

    With 2 components:

    2 components

    And this is the NMF reconstruction with 20 components.

    NMF reconstruction with 20 components, equivalent to the sum of 20 1-D vector outer products)

    Which clearly indicates that a single 1-D outer product is not enough for this images. However it works for the blocky images.

    If you are not restricted to an outer product of vectors, then matrix factorization can be an alternative. BTW, there exist a vast number of matrix factorization techniques. Another alternative in sklearn is SVD.

    3) Finally, there may be a scaling issue in x0 and y0. Note that the elements of np.outer(x0, y0) are orders of magnitude higher than the elements of mat.

    Although I still get good results for the 3 blocky examples with the code provided, in general it is a good practice to have comparable scales when taking square differences. For instance, you might want to scale x0 and y0 so that the norm of np.outer is comparable to the one of mat.