SGDClassifier save loss from every iteration to array

When I train an SGDClassifier in scikit-learn, I can print the loss value at every iteration by setting verbose. How can I store these values in an array?

Solution

Modifying the answer from this post.

import sys
import numpy as np
from io import StringIO
import matplotlib.pyplot as plt
from sklearn.linear_model import SGDClassifier
from tensorflow.keras.datasets import mnist

(x_tr, y_tr), (x_te, y_te) = mnist.load_data()
x_tr, x_te = x_tr.reshape(-1, 784), x_te.reshape(-1, 784)

Intercept the output printed by the SGDClassifier by redirecting sys.stdout to an in-memory buffer

old_stdout = sys.stdout
sys.stdout = mystdout = StringIO()

Make the model print its progress by setting verbose to 1, then fit it.

clf = SGDClassifier(verbose=1)
clf.fit(x_tr, y_tr)

Restore the original stdout and retrieve the captured verbose output

sys.stdout = old_stdout
loss_history = mystdout.getvalue()
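
If fit raises an exception while sys.stdout is swapped out by hand, the original stream is never restored. A minimal sketch of a safer variant uses contextlib.redirect_stdout from the standard library, which restores stdout when the block exits. The helper name capture_stdout and the stand-in fake_fit (which prints a line in an assumed verbose-like format) are hypothetical and only illustrate the pattern; in practice you would pass clf.fit, x_tr, y_tr instead.

```python
import io
from contextlib import redirect_stdout


def capture_stdout(func, *args, **kwargs):
    """Call func and return (its result, the text it printed to stdout)."""
    buffer = io.StringIO()
    # redirect_stdout puts sys.stdout back on exit, even if func raises
    with redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


def fake_fit():
    # Hypothetical stand-in for clf.fit; the line format is an assumption
    print("-- Epoch 1")
    print("Norm: 1.00, NNZs: 3, Bias: 0.10, T: 100, Avg. loss: 0.75")
    return "fitted"


result, loss_history = capture_stdout(fake_fit)
```

Like the manual swap, this only captures text written through Python's sys.stdout.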

Create a list to store the loss values

loss_list = []

Parse the loss values out of the captured text stored in loss_history

for line in loss_history.split('\n'):
    # Skip lines that do not report a loss
    if "loss: " not in line:
        continue
    loss_list.append(float(line.split("loss: ")[-1]))
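
The same parsing can also be sketched with a regular expression, which avoids splitting each line twice. The function name parse_losses and the sample text are illustrative assumptions; the sample only mimics the general shape of the verbose output, with the loss value following "loss: ".

```python
import re

# Capture the token that follows "loss: " (float() also accepts "inf"/"nan")
LOSS_RE = re.compile(r"loss: (\S+)")


def parse_losses(text):
    """Return every value that follows 'loss: ' in text as a float."""
    return [float(match.group(1)) for match in LOSS_RE.finditer(text)]


# Assumed sample of captured output, for illustration only
sample = (
    "-- Epoch 1\n"
    "Norm: 1.00, NNZs: 3, Bias: 0.10, T: 100, Avg. loss: 0.75\n"
    "Total training time: 0.00 seconds.\n"
    "-- Epoch 2\n"
    "Norm: 1.20, NNZs: 3, Bias: 0.12, T: 200, Avg. loss: 0.5\n"
)
losses = parse_losses(sample)
```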

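Plot the loss values. Note that for multiclass data such as MNIST, SGDClassifier trains one binary classifier per class, so loss_list contains the epochs of every classifier one after another.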

plt.figure()
plt.plot(np.arange(len(loss_list)), loss_list)
plt.xlabel("Time in epochs")
plt.ylabel("Loss")
plt.show()
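
With more than two classes, SGDClassifier fits one binary classifier per class, so the captured text holds several training runs back to back. A rough sketch for separating them, assuming each run begins with a line reading "-- Epoch 1" and that the runs are printed sequentially (the default when n_jobs is not set); the helper name split_runs and the sample text are hypothetical:

```python
def split_runs(text):
    """Group loss values into one list per training run."""
    runs = []
    for line in text.splitlines():
        if line.strip() == "-- Epoch 1":
            # Assumed marker for the start of a new run
            runs.append([])
        elif "loss: " in line and runs:
            runs[-1].append(float(line.split("loss: ")[-1]))
    return runs


# Assumed sample: two runs, the first with two epochs, the second with one
sample = (
    "-- Epoch 1\n"
    "Norm: 1.00, NNZs: 3, Bias: 0.10, T: 100, Avg. loss: 0.9\n"
    "-- Epoch 2\n"
    "Norm: 1.10, NNZs: 3, Bias: 0.11, T: 200, Avg. loss: 0.6\n"
    "-- Epoch 1\n"
    "Norm: 0.90, NNZs: 3, Bias: 0.05, T: 100, Avg. loss: 0.8\n"
)
runs = split_runs(sample)
```

Each inner list can then be plotted as its own curve, for example with plt.plot(run) in a loop.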

Finally, to store the loss values in a NumPy array, convert the list:

loss_list = np.array(loss_list)