How does Keras load_data() know what part of the data is the train and test set?
The best way to find out is to look at the Keras source code:
    def load_data(path='mnist.npz'):
        path = get_file(path, origin='https://s3.amazonaws.com/img-datasets/mnist.npz', file_hash='8a61469f7ea1b51cbae51d4f78837e45')
        with np.load(path, allow_pickle=True) as f:
            x_train, y_train = f['x_train'], f['y_train']
            x_test, y_test = f['x_test'], f['y_test']
        return (x_train, y_train), (x_test, y_test)
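As you can see, load_data() does not split anything itself. It downloads a single .npz file in which the dataset is already separated into train and test arrays, stored under the keys x_train, y_train, x_test and y_test, and simply returns them.
The only parameter (path) is where to store the downloaded dataset.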
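
To make the point concrete, here is a minimal sketch that builds a tiny archive with the same four keys and reads it back the same way load_data() does. The file name toy_mnist.npz, the helper names save_toy_dataset and load_toy_dataset, and the random data are all made up for illustration; only np.savez and np.load are real NumPy calls, and nothing is downloaded.

```python
import os
import tempfile

import numpy as np


def save_toy_dataset(path):
    """Write a tiny, made-up dataset whose train/test split is fixed in the file."""
    rng = np.random.default_rng(0)
    np.savez(
        path,
        x_train=rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8),
        y_train=rng.integers(0, 10, size=6, dtype=np.uint8),
        x_test=rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8),
        y_test=rng.integers(0, 10, size=2, dtype=np.uint8),
    )


def load_toy_dataset(path):
    """Read the arrays back by key, mirroring the body of load_data()."""
    with np.load(path, allow_pickle=True) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)


# Hypothetical file location, used only for this illustration.
toy_path = os.path.join(tempfile.mkdtemp(), 'toy_mnist.npz')
save_toy_dataset(toy_path)

(x_train, y_train), (x_test, y_test) = load_toy_dataset(toy_path)
print(x_train.shape, x_test.shape)  # (6, 28, 28) (2, 28, 28)
```

The loader never decides which samples are train or test; whoever wrote the file decided that when saving the arrays under separate keys.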