TensorFlow: save the model with the smallest validation error
This can be done with the Keras ModelCheckpoint callback. In TensorFlow 1 (with standalone Keras):
# import whatever else you need to build the model
from keras.callbacks import ModelCheckpoint
# add a checkpoint callback that saves the model with the lowest val_loss
filepath = 'tf1_mnist_cnn.hdf5'
# with monitor='val_loss', mode='auto' treats lower values as better
save_checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1,
                                  save_best_only=True, save_weights_only=False,
                                  mode='auto', period=1)
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[save_checkpoint])
TensorFlow 2:
# import whatever else you need to build the model
from tensorflow.keras.callbacks import ModelCheckpoint
# add a checkpoint callback that saves the model with the lowest val_loss
filepath = 'tf2_mnist_model.hdf5'
# save_freq='epoch' checks after every epoch; an integer would count batches instead
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1,
                             save_best_only=True, save_weights_only=False,
                             mode='auto', save_freq='epoch')
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[checkpoint])
Complete demo files are here: https://github.com/nateGeorge/slurm_gpu_ubuntu/tree/master/demo_files.
You need to calculate the classification accuracy on the validation set periodically during training, keep track of the best value seen so far, and write a checkpoint only when the validation accuracy improves on that best value.
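The bookkeeping described above can be sketched in a few lines of plain Python. This is only an illustration, not code from the linked tutorial: BestCheckpointTracker and save_fn are hypothetical names, and save_fn stands in for whatever actually writes the checkpoint (for example a call to saver.save in TensorFlow 1).

```python
class BestCheckpointTracker:
    """Remembers the best validation accuracy and saves only on improvement."""

    def __init__(self, save_fn):
        # save_fn is a hypothetical zero-argument callable that writes the
        # checkpoint; replace it with your framework's save call.
        self.save_fn = save_fn
        self.best_accuracy = float("-inf")

    def update(self, val_accuracy):
        """Save and return True if val_accuracy beats the best seen so far."""
        if val_accuracy > self.best_accuracy:
            self.best_accuracy = val_accuracy
            self.save_fn()
            return True
        return False


# Usage with made-up per-epoch validation accuracies.
saved = []
tracker = BestCheckpointTracker(save_fn=lambda: saved.append("checkpoint written"))
accuracies = [0.81, 0.85, 0.84, 0.90, 0.90]
improved_epochs = [epoch for epoch, acc in enumerate(accuracies) if tracker.update(acc)]
```

Note that a tie (the last 0.90) does not trigger a new save, since the comparison is strict; whether ties should overwrite the checkpoint is a design choice.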
If the dataset or the model is large, you may have to split the validation set into batches so that the computation fits in memory.
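A minimal sketch of batched validation, assuming the model is exposed through a callable that returns one predicted class label per input. The names batched_accuracy and predict_fn are hypothetical and framework-agnostic; in TensorFlow, predict_fn would wrap a session run or a model call followed by an argmax.

```python
def batched_accuracy(predict_fn, inputs, labels, batch_size=256):
    """Classification accuracy computed over fixed-size slices of the validation set."""
    if len(inputs) != len(labels):
        raise ValueError("inputs and labels must have the same length")
    if len(inputs) == 0:
        raise ValueError("validation set is empty")
    correct = 0
    for start in range(0, len(inputs), batch_size):
        end = start + batch_size
        # predict_fn (hypothetical) maps a batch of inputs to predicted labels,
        # so only one batch has to be in memory at a time.
        predictions = predict_fn(inputs[start:end])
        correct += sum(int(p == y) for p, y in zip(predictions, labels[start:end]))
    # Dividing the total count by the full size keeps the result exact even
    # when the last batch is smaller than the others.
    return correct / len(inputs)


# Toy usage: a "model" that predicts class 1 for positive inputs.
toy_inputs = [-2, -1, 1, 2, 3]
toy_labels = [0, 0, 1, 1, 0]
toy_accuracy = batched_accuracy(
    lambda batch: [int(x > 0) for x in batch], toy_inputs, toy_labels, batch_size=2
)
```

Summing the correct predictions and dividing once at the end avoids the bias that averaging per-batch accuracies would introduce when the final batch is short.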
This tutorial shows exactly how to do what you want:
https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/04_Save_Restore.ipynb
It is also available as a short video:
https://www.youtube.com/watch?v=Lx8JUJROkh0