my function that tries to reconstruct an image using SVD with 3 loops sometimes yields different results and tries to calculate the reconstructed image with bad values. Sometimes it works completely fine but if I repeatedly try to reconstruct the same image with the same parameters, it eventually shows bad results. This happens at random and probably not due to variables not being overwritten/deleted correctly after each function call. Sometimes I also get this error message:
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:21: RuntimeWarning: invalid
value encountered in add
/usr/local/lib/python3.7/dist-packages/matplotlib/image.py:452: RuntimeWarning: overflow
encountered in double_scalars
dv = np.float64(self.norm.vmax) - np.float64(self.norm.vmin)
/usr/local/lib/python3.7/dist-packages/matplotlib/image.py:455: RuntimeWarning: invalid
value encountered in double_scalars
newmin = vmid - dv * fact
/usr/local/lib/python3.7/dist-packages/matplotlib/colors.py:1026: RuntimeWarning:
overflow encountered in double_scalars
resdat /= (vmax - vmin)
Here is my code which reproduces this bug: https://www.kaggle.com/krajnovic/numpy-variance
I debugged it for quite some time and I think the issue happens at the last summation of the "reco" object.
You didn't paste your relevant code from kaggle here, so I am pasting it here for you.
import os
import imageio
import numpy as np
import matplotlib.pyplot as plt
im = imageio.imread("/kaggle/input/numpy-variance111/m1-1_slice125.png")
im = im -im.min() / im.max() - im.min()
u,s,vt = np.linalg.svd(im, full_matrices=False)
k = 20
def reconstruct_svd_for_loops3(u,s,vt,k):
"""SVD reconstruction for k components using 3 for-loops
for + in svd for zeile for spalte
Inputs:
u: (m,n) numpy array
s: (n) numpy array (diagonal matrix)
vt: (n,n) numpy array
k: number of reconstructed singular components
Ouput:
(m,n) numpy array U_mk * S_k * V^T_nk for k reconstructed components
"""
### BEGIN SOLUTION
reco = np.empty(u.shape)
for i in range(k): #for k components
sum_ = np.empty(u.shape)
for j in range(u[:,i].shape[0]): #for each element in ith column of u
for k in range(vt[i,:].shape[0]): #for each element in ith row of vt
sum_[j,k] = u[j,i] * vt[i,k]
sum_ = s[i] * sum_
reco += sum_
### END SOLUTION
del sum_
return reco
I have found some errors in your code, but only the first listed error is the one causing the randomness.
The reason it works sometimes and sometimes not is because you are using np.empty
, which initializes your reco
and sum_
arrays to garbage values ranging from -inf
to inf
. Thus sometimes when are lucky you may happen to initialize with values that doesn't crash. You can fix thus by using np.zeros
instead.
Additionally, you are specifying incorrect shapes and you have also implemented your outer prdouct (the two inner loops) incorrectly. This does not cause the randomness though, but your reconstruction will be incorrect. See code below. Your code was fine. I misread some things the first time.
Your image normalization is wrong, you are missing some parenthesis'. See code below.
This isn't an error, but a suggestion; you can easily write all of this without any loops by utilizing NumPy broadcasting and matrix multiplication. See code below:
import numpy as np
from PIL import Image
from urllib import request
from matplotlib import pyplot as plt
url = "https://pbs.twimg.com/profile_images/656159226805399552/v6ffWuIc_400x400.jpg"
img = np.array(Image.open(request.urlopen(url))) # Image of owl
def normalize_image(X: np.ndarray):
xmin = X.min()
xmax = X.max()
return (X - xmin) / (xmax - xmin)
X = img[..., 0] # Pick a single channel
X = normalize_image(X)
def reconstruct_loops(u, s, vt, k):
reco = np.zeros(u.shape)
for i in range(k): #for k components
sum_ = np.zeros(u.shape)
for j in range(u[:,i].shape[0]): #for each element in ith column of u
for k in range(vt[i,:].shape[0]): #for each element in ith row of vt
sum_[j,k] = u[j,i] * vt[i,k]
sum_ = s[i] * sum_
reco += sum_
return reco
def reconstruct_vectorized(u, s, vt, k):
return (u[:,:k]*s[:k])@vt[:k,:]
k = 10
u, s, vt = np.linalg.svd(X, full_matrices=False)
reco_loop = reconstruct_loops(u, s, vt, k)
reco_vector = reconstruct_vectorized(u, s, vt, k)
fig, axes = plt.subplots(1,3)
axes[0].set_title("Original")
axes[0].set_axis_off()
axes[0].imshow(X, cmap='gray')
axes[1].set_title("Using\nloops")
axes[1].set_axis_off()
axes[1].imshow(normalize_image(reco_loop), cmap='gray')
axes[2].set_title("Using numpy broadcasting\nand matrix\nmultiplication")
axes[2].set_axis_off()
axes[2].imshow(normalize_image(reco_vector), cmap='gray')
plt.show()