Python sklearn show loss values during training
So I couldn't find very good documentation on directly fetching the loss values per iteration, but I hope this will help someone in the future:
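import sys
from io import StringIO

import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import SGDClassifier

# Redirect stdout so the verbose training output can be captured
old_stdout = sys.stdout
sys.stdout = mystdout = StringIO()
try:
    clf = SGDClassifier(**kwargs, verbose=1)
    clf.fit(X_tr, y_tr)
finally:
    sys.stdout = old_stdout  # always restore stdout, even if fit() raises
loss_history = mystdout.getvalue()

# Keep only the lines that contain "loss: " and parse the number after it
loss_list = []
for line in loss_history.split('\n'):
    if len(line.split("loss: ")) == 1:
        continue
    loss_list.append(float(line.split("loss: ")[-1]))

plt.figure()
plt.plot(np.arange(len(loss_list)), loss_list)
plt.xlabel("Time in epochs")
plt.ylabel("Loss")
# Label the axes before saving so they appear in the saved image
plt.savefig("warmstart_plots/pure_SGD:" + str(kwargs) + ".png")
plt.close()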
This code takes a normal SGDClassifier (or just about any linear classifier), captures the output that is printed when verbose=1 is set, and splits each line to extract the loss. Obviously this is slower than training silently, but it gives us the loss values and plots them.
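The capture-and-parse step can also be sketched with only the standard library, using contextlib.redirect_stdout and a regular expression. This is a minimal sketch, not part of the original answer: capture_losses, LOSS_PATTERN and fake_fit are hypothetical names, and the printed lines in fake_fit are illustrative sample text assumed to resemble the verbose output, not a real training log.

```python
import contextlib
import io
import re

# Matches the "loss: <number>" fragment the verbose output is assumed to contain.
LOSS_PATTERN = re.compile(r"loss: ([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)")


def capture_losses(fit_fn):
    """Run fit_fn() with stdout captured and return the parsed loss values."""
    buffer = io.StringIO()
    # redirect_stdout restores sys.stdout on exit, even if fit_fn() raises.
    with contextlib.redirect_stdout(buffer):
        fit_fn()
    return [float(m.group(1)) for m in LOSS_PATTERN.finditer(buffer.getvalue())]


def fake_fit():
    # Hypothetical stand-in for clf.fit(X_tr, y_tr) with verbose=1;
    # the lines below are made-up sample output, not real results.
    print("-- Epoch 1")
    print("Norm: 0.50, NNZs: 4, Bias: 0.010000, T: 100, Avg. loss: 0.693147")
    print("Total training time: 0.00 seconds.")
    print("-- Epoch 2")
    print("Norm: 0.61, NNZs: 4, Bias: 0.012000, T: 200, Avg. loss: 0.512300")


losses = capture_losses(fake_fit)
print(losses)
```

With a real model you would pass something like lambda: clf.fit(X_tr, y_tr) instead of fake_fit. The regular expression only handles plain decimal and exponent notation, so values such as nan or inf would be skipped.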
Use model.loss_curve_.
You can use the verbose option to print the values on each iteration, but if you want the actual values this is not the best way to proceed, because you will need to do some hacky stuff to parse them.
It's true, the documentation doesn't mention anything about this attribute, but if you check the source code, you may notice that one of MLPClassifier's base classes (BaseMultilayerPerceptron) actually defines an attribute loss_curve_ where it stores the loss value for each iteration.
As you get all the values in a list, plotting should be trivial using any library.
Notice that this attribute is only present while using a stochastic solver (i.e. sgd or adam).
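As a minimal sketch of reading that attribute, assuming scikit-learn is installed: the synthetic dataset from make_classification and the hyperparameters below are illustrative choices, not something from the original answer.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# Small synthetic dataset, used only for illustration.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# loss_curve_ is only populated for the stochastic solvers ("sgd" or "adam").
model = MLPClassifier(hidden_layer_sizes=(16,), solver="adam",
                      max_iter=50, random_state=0)

with warnings.catch_warnings():
    # With so few iterations the model may not converge; silence that warning.
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    model.fit(X, y)

losses = model.loss_curve_  # one loss value per training iteration
print(len(losses), losses[:3])
```

From here, something like plt.plot(model.loss_curve_) with matplotlib should be enough to draw the curve.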
I just adapted and updated the answer from @OneRaynyDay. Using a context manager is way more elegant.
Defining Context Manager:
import sys
import io

import matplotlib.pyplot as plt
import numpy as np


class DisplayLossCurve(object):
    """Capture verbose training output and plot the loss curve.

    Make sure the model verbose is set to 1.
    """

    def __init__(self, print_loss=False):
        self.print_loss = print_loss

    def __enter__(self):
        # Redirect stdout so the verbose output lands in a buffer
        self.old_stdout = sys.stdout
        sys.stdout = self.mystdout = io.StringIO()

    def __exit__(self, *args, **kwargs):
        sys.stdout = self.old_stdout
        loss_history = self.mystdout.getvalue()
        loss_list = []
        for line in loss_history.split('\n'):
            if len(line.split("loss: ")) == 1:
                continue
            loss_list.append(float(line.split("loss: ")[-1]))
        plt.figure()
        plt.plot(np.arange(len(loss_list)), loss_list)
        plt.xlabel("Epoch")
        plt.ylabel("Loss")
        if self.print_loss:
            print("=============== Loss Array ===============")
            print(np.array(loss_list))
        # Return False so exceptions raised inside the with block still propagate
        return False
Usage:
from sklearn.linear_model import SGDRegressor

model = SGDRegressor(verbose=1)
with DisplayLossCurve():
    model.fit(X, Y)
# OR
with DisplayLossCurve(print_loss=True):
    model.fit(X, Y)