When I train an SGDClassifier in scikit-learn, I can print the loss value at every iteration by setting verbose. How can I store these values in an array?
Modifying the answer from this post.
import sys
import numpy as np
from io import StringIO
import matplotlib.pyplot as plt
from sklearn.linear_model import SGDClassifier
from tensorflow.keras.datasets import mnist
(x_tr, y_tr), (x_te, y_te) = mnist.load_data()
x_tr, x_te = x_tr.reshape(-1, 784), x_te.reshape(-1, 784)
Intercept the output printed by the SGDClassifier by redirecting stdout to an in-memory buffer:
old_stdout = sys.stdout
sys.stdout = mystdout = StringIO()
Set the model to print its output by setting verbose to 1.
clf = SGDClassifier(verbose=1)
clf.fit(x_tr, y_tr)
Restore stdout and get the captured verbose output of the SGDClassifier:
sys.stdout = old_stdout
loss_history = mystdout.getvalue()
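The manual swap of sys.stdout above leaves stdout redirected if fit raises an exception. As a minimal sketch of a safer variant, the standard library's contextlib.redirect_stdout restores stdout automatically. The helper name capture_stdout and the stand-in function noisy_fit are hypothetical and only used for illustration; in practice you would pass clf.fit instead.

```python
import io
from contextlib import redirect_stdout


def capture_stdout(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, printed_text)."""
    buffer = io.StringIO()
    # redirect_stdout puts sys.stdout back on exit, even if func raises.
    with redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


def noisy_fit():
    # Hypothetical stand-in for clf.fit(x_tr, y_tr); it only prints
    # lines shaped like the verbose log and returns a placeholder.
    print("-- Epoch 1")
    print("Norm: 1.00, NNZs: 5, Bias: 0.100000, T: 100, Avg. loss: 0.500000")
    return "fitted"


result, loss_history = capture_stdout(noisy_fit)
```

With the real model this would read `_, loss_history = capture_stdout(clf.fit, x_tr, y_tr)`. Like the original approach, it only captures text written through Python's sys.stdout.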
Create a list to store the loss values
loss_list = []
Append the printed loss values, which are stored in loss_history:
for line in loss_history.split('\n'):
    # Skip lines that do not report a loss value
    if "loss: " not in line:
        continue
    loss_list.append(float(line.split("loss: ")[-1]))
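For a multi-class target such as MNIST, SGDClassifier fits one binary classifier per class, so the captured log holds several epoch sequences back to back and a single flat list mixes them together. The sketch below groups the losses per sequence instead. It assumes the log contains "-- Epoch N" marker lines and lines ending in "loss: <value>"; SAMPLE_LOG is made-up illustrative text, not real training output, and split_loss_runs is a hypothetical helper name.

```python
import re

# Illustrative sample in the assumed verbose-log layout (not real output).
SAMPLE_LOG = """-- Epoch 1
Norm: 1.00, NNZs: 5, Bias: 0.100000, T: 100, Avg. loss: 0.500000
Total training time: 0.00 seconds.
-- Epoch 2
Norm: 1.10, NNZs: 5, Bias: 0.120000, T: 200, Avg. loss: 0.250000
Total training time: 0.01 seconds.
-- Epoch 1
Norm: 0.90, NNZs: 5, Bias: -0.200000, T: 100, Avg. loss: 0.750000
Total training time: 0.00 seconds.
"""

EPOCH_RE = re.compile(r"^-- Epoch (\d+)")
LOSS_RE = re.compile(r"loss: (\S+)")


def split_loss_runs(log_text):
    """Return one list of losses per training run found in log_text."""
    runs = []
    for line in log_text.splitlines():
        epoch = EPOCH_RE.match(line)
        if epoch:
            # Epoch 1 marks the start of a new binary classifier's run.
            if int(epoch.group(1)) == 1:
                runs.append([])
            continue
        loss = LOSS_RE.search(line)
        if loss:
            if not runs:  # loss line seen before any epoch marker
                runs.append([])
            runs[-1].append(float(loss.group(1)))
    return runs


runs = split_loss_runs(SAMPLE_LOG)
# Flattening the runs gives the same kind of list as loss_list.
flat = [loss for run in runs for loss in run]
```

Each entry of `runs` can then be plotted as its own curve, which keeps the x-axis meaningful as an epoch index.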
Plot the loss values to visualize them:
plt.figure()
plt.plot(np.arange(len(loss_list)), loss_list)
plt.xlabel("Time in epochs")
plt.ylabel("Loss")
plt.show()
To save the loss values to an array, convert the list with NumPy:
loss_list = np.array(loss_list)