I'm trying to get into Singular Value Decomposition (SVD). I've found this YouTube lecture that contains an example. However, when I try this example in numpy I'm getting "kind of" different results. In this example the input matrix is
A = [ [1,1,1,0,0], [3,3,3,0,0], [4,4,4,0,0], [5,5,5,0,0], [0,2,0,4,4], [0,0,0,5,5], [0,1,0,2,2] ]
import numpy as np
A = np.asarray(A)
print(A)
[[1 1 1 0 0]
[3 3 3 0 0]
[4 4 4 0 0]
[5 5 5 0 0]
[0 2 0 4 4]
[0 0 0 5 5]
[0 1 0 2 2]]
The rank of this matrix is 3 (via np.linalg.matrix_rank(A)). The lecture states that the number of singular values equals the rank of the matrix, and in the example the Sigma matrix S is indeed of size 3×3. However, when I perform
U, S, V = np.linalg.svd(A)
matrix S
contains 5 values. On the other hand, the first 3 values match the ones in the example, and the other 2 are essentially 0. Can I assume that I get more singular values than the rank because of the numerical algorithm behind SVD and the finite representation of real numbers on computers - or something along those lines?
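For reference, a minimal snippet reproducing this observation (the rank threshold of 1e-10 is my own illustrative choice, not what matrix_rank uses internally):

```python
import numpy as np

A = np.asarray([[1, 1, 1, 0, 0],
                [3, 3, 3, 0, 0],
                [4, 4, 4, 0, 0],
                [5, 5, 5, 0, 0],
                [0, 2, 0, 4, 4],
                [0, 0, 0, 5, 5],
                [0, 1, 0, 2, 2]])

U, S, V = np.linalg.svd(A)

# svd always returns min(m, n) singular values -- here min(7, 5) = 5 --
# regardless of the rank; the trailing ones are only numerically zero.
print(len(S))             # 5
print(np.sum(S > 1e-10))  # 3, matching np.linalg.matrix_rank(A)
```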
As mentioned on this page, numpy internally uses the LAPACK routine _gesdd
to compute the SVD. Now, if you look at the _gesdd
documentation, it mentions:
To find the SVD of a general matrix A, call the LAPACK routine ?gebrd or ?gbbrd for reducing A to a bidiagonal matrix B by a unitary (orthogonal) transformation: A = QBP^H. Then call ?bdsqr, which forms the SVD of a bidiagonal matrix: B = U1ΣV1^H.
So, there are 2 steps involved here:

1. Reduce A to a bidiagonal matrix B using orthogonal transformations.
2. Run a QR-type algorithm on B to obtain the singular values.

The QR algorithm is an iterative algorithm, meaning you don't get an "exact" answer, but get better and better approximations with each iteration, stopping once the change in values falls below a threshold - so it is "approximate" in that sense.
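To see this machinery from Python, scipy (assuming it is installed) exposes the choice of LAPACK driver via scipy.linalg.svd's lapack_driver parameter, whereas numpy always uses gesdd. A small sketch comparing the two drivers on the matrix from the question:

```python
import numpy as np
from scipy.linalg import svd

A = np.asarray([[1, 1, 1, 0, 0],
                [3, 3, 3, 0, 0],
                [4, 4, 4, 0, 0],
                [5, 5, 5, 0, 0],
                [0, 2, 0, 4, 4],
                [0, 0, 0, 5, 5],
                [0, 1, 0, 2, 2]], dtype=float)

# compute_uv=False returns only the singular values;
# lapack_driver selects the underlying LAPACK routine.
s_gesdd = svd(A, compute_uv=False, lapack_driver='gesdd')
s_gesvd = svd(A, compute_uv=False, lapack_driver='gesvd')

# Both drivers agree to machine precision on this small matrix,
# though each produces approximations in the sense described above.
print(np.allclose(s_gesdd, s_gesvd))
```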
Thus, along with the numerical inaccuracies due to the finite machine representation of reals, even if we had infinite representational capacity we would still get "approximate" results (when running the algorithm for finite time) because of the iterative nature of the algorithm.
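One way to observe this approximation in practice is to rebuild A from its computed factors and look at the residual; it is close to zero only up to floating-point error (a sketch, padding Σ to A's rectangular shape):

```python
import numpy as np

A = np.asarray([[1, 1, 1, 0, 0],
                [3, 3, 3, 0, 0],
                [4, 4, 4, 0, 0],
                [5, 5, 5, 0, 0],
                [0, 2, 0, 4, 4],
                [0, 0, 0, 5, 5],
                [0, 1, 0, 2, 2]], dtype=float)

U, S, Vh = np.linalg.svd(A)

# Sigma must be padded from the 1-D array S to A's 7x5 shape
# before the factors can be multiplied back together.
Sigma = np.zeros(A.shape)
Sigma[:len(S), :len(S)] = np.diag(S)
A_rec = U @ Sigma @ Vh

# The reconstruction matches A to machine precision, but the
# residual is typically a tiny nonzero number, not exactly 0.
print(np.allclose(A, A_rec))
print(np.abs(A - A_rec).max())
```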