Could anyone explain the return of linalg.lstsq in details?

Though the linalg.lstsq document is offered. I still feel hard to understand since it is not quite detailed.

x : {(N,), (N, K)} ndarray

Least-squares solution. If b is two-dimensional, the solutions are in the K columns of x.

residuals : {(1,), (K,), (0,)} ndarray

Sums of residuals; squared Euclidean 2-norm for each column in b - a*x. If the rank of a is < N or M <= N, this is an empty array. If b is 1-dimensional, this is a (1,) shape array. Otherwise the shape is (K,).

rank : int

Rank of matrix a.

s : (min(M, N),) ndarray

Singular values of a.

I tried to observe the output. But I only figure out the rank is 2. For the rest, I don't get why it is so.

x = np.array([0, 1, 2, 3])
y = np.array([-1, 0.2, 0.9, 2.1])
A = np.vstack([x, np.ones(len(x))]).T
print(A)
print('-------------------------')
print(np.linalg.lstsq(A, y, rcond=None))

Gives

[[0. 1.]
 [1. 1.]
 [2. 1.]
 [3. 1.]]
-------------------------
(array([ 1.  , -0.95]), array([0.05]), 2, array([4.10003045, 1.09075677]))

I don't understand what the tuples, "(N, ), (N, K), (1,), (K,), (0,), (M, N)", represent in the document.

For example, np.linalg.lstsq(A, y, rcond=None)[0] would be array([ 1. , -0.95]) How does it relate to {(N,), (N, K)}?

Solution

Those tuples are the possible shapes of inputs and outputs. In your example, A.shape = (4, 2) and y.shape = (4,). Looking at the documentation, M = 4, N = 2, and we are dealing with the cases without K. So the output's shapes should be x.shape = (N,) = (2,), residuals.shape = (1,), s.shape = (min(M, N),) = (2,).

Let's look at the outputs one at a time

>>> x, residuals, rank, s = np.linalg.lstsq(A, y, rcond=None)

x is the least square solution of A @ x = y, so it minimises np.linalg.norm(A @ x - y)**2:

>>> A.T @ (A @ x - y)
array([1.72084569e-15, 2.16493490e-15])

The other outputs are there to tell you how good this solution is and how susceptible it is to numerical errors.

residuals is the squared norm of the mis-match between A @ x and y:

>>> np.linalg.norm(A @ x - y)**2
0.04999999999999995
>>> residuals[0]
0.04999999999999971

rank is the rank of A:

np.linalg.matrix_rank(A)
2
>>> rank
2

s contains the singular values of A

>>> np.linalg.svd(A, compute_uv=False)
array([4.10003045, 1.09075677])
>>> s
array([4.10003045, 1.09075677])

Are you familiar with the mathematical concepts?