Search code examples
pythonnumpyleast-squares

Could anyone explain the return of linalg.lstsq in details?


Though the linalg.lstsq document is offered. I still feel hard to understand since it is not quite detailed.

x : {(N,), (N, K)} ndarray

Least-squares solution. If b is two-dimensional, the solutions are in the K columns of x.

residuals : {(1,), (K,), (0,)} ndarray

Sums of residuals; squared Euclidean 2-norm for each column in b - a*x. If the rank of a is < N or M <= N, this is an empty array. If b is 1-dimensional, this is a (1,) shape array. Otherwise the shape is (K,).

rank : int

Rank of matrix a.

s : (min(M, N),) ndarray

Singular values of a.

I tried to observe the output. But I only figure out the rank is 2. For the rest, I don't get why it is so.

x = np.array([0, 1, 2, 3])
y = np.array([-1, 0.2, 0.9, 2.1])
A = np.vstack([x, np.ones(len(x))]).T
print(A)
print('-------------------------')
print(np.linalg.lstsq(A, y, rcond=None))

Gives

[[0. 1.]
 [1. 1.]
 [2. 1.]
 [3. 1.]]
-------------------------
(array([ 1.  , -0.95]), array([0.05]), 2, array([4.10003045, 1.09075677]))

I don't understand what the tuples, "(N, ), (N, K), (1,), (K,), (0,), (M, N)", represent in the document.

For example, np.linalg.lstsq(A, y, rcond=None)[0] would be array([ 1. , -0.95]) How does it relate to {(N,), (N, K)}?


Solution

  • Those tuples are the possible shapes of inputs and outputs. In your example, A.shape = (4, 2) and y.shape = (4,). Looking at the documentation, M = 4, N = 2, and we are dealing with the cases without K. So the output's shapes should be x.shape = (N,) = (2,), residuals.shape = (1,), s.shape = (min(M, N),) = (2,).

    Let's look at the outputs one at a time

    >>> x, residuals, rank, s = np.linalg.lstsq(A, y, rcond=None)
    

    x is the least square solution of A @ x = y, so it minimises np.linalg.norm(A @ x - y)**2:

    >>> A.T @ (A @ x - y)
    array([1.72084569e-15, 2.16493490e-15])
    

    The other outputs are there to tell you how good this solution is and how susceptible it is to numerical errors.

    residuals is the squared norm of the mis-match between A @ x and y:

    >>> np.linalg.norm(A @ x - y)**2
    0.04999999999999995
    >>> residuals[0]
    0.04999999999999971
    

    rank is the rank of A:

    np.linalg.matrix_rank(A)
    2
    >>> rank
    2
    

    s contains the singular values of A

    >>> np.linalg.svd(A, compute_uv=False)
    array([4.10003045, 1.09075677])
    >>> s
    array([4.10003045, 1.09075677])
    

    Are you familiar with the mathematical concepts?