How can I import the MNIST dataset that has been manually downloaded?
Well, the keras.datasets.mnist file is really short. You can manually simulate the same action, that is:
- Download the dataset from https://s3.amazonaws.com/img-datasets/mnist.pkl.gz
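- Load it with the following code, run from the directory that contains the file:
import gzip
import sys

if sys.version_info < (3,):
    import cPickle as pickle  # Python 2
else:
    import pickle

# Open the gzip archive and unpickle the dataset it contains
with gzip.open('mnist.pkl.gz', 'rb') as f:
    if sys.version_info < (3,):
        data = pickle.load(f)
    else:
        data = pickle.load(f, encoding='bytes')

(x_train, _), (x_test, _) = data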
You do not need additional code for that; you can tell load_data to load a local copy in the first place:
- Download the file https://s3.amazonaws.com/img-datasets/mnist.npz from another computer with proper (proxy) access (the URL is taken from https://github.com/keras-team/keras/blob/master/keras/datasets/mnist.py).
- Copy it to the directory ~/.keras/datasets/ (on Linux and macOS).
- Run load_data(path='mnist.npz') with the right file name.
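The copy step can also be scripted. The following is a minimal sketch, assuming the file has already been downloaded to some local path. The function name install_dataset and its cache_dir parameter are made up for illustration, and the default directory is the ~/.keras/datasets/ location mentioned in this answer.

```python
import os
import shutil


def install_dataset(src, cache_dir=None):
    """Copy a manually downloaded dataset file into the Keras cache directory.

    `install_dataset` and `cache_dir` are hypothetical names for this sketch.
    """
    if cache_dir is None:
        # Assumed default cache location (Linux and macOS).
        cache_dir = os.path.join(os.path.expanduser('~'), '.keras', 'datasets')
    os.makedirs(cache_dir, exist_ok=True)
    dst = os.path.join(cache_dir, os.path.basename(src))
    shutil.copyfile(src, dst)
    return dst
```

After something like install_dataset('/path/to/mnist.npz'), calling load_data(path='mnist.npz') should find the local copy.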
The Keras file now lives at a new path in Google Cloud Storage (it was previously in AWS S3):
https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
When calling tf.keras.datasets.mnist.load_data(), you can pass a path parameter. load_data() calls get_file(), which receives that value as its fname parameter. If path is a full path and the file exists, it will not be downloaded.
Example:
# gsutil cp gs://tensorflow/tf-keras-datasets/mnist.npz /tmp/data/mnist.npz
# python3
>>> import tensorflow as tf
>>> path = '/tmp/data/mnist.npz'
>>> (train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data(path)
>>> len(train_images)
60000
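Why a full path skips the download can be illustrated with a small sketch. It assumes the lookup simply joins the cache directory with the given file name. The helper resolve_dataset_path is hypothetical and is not the Keras implementation; in particular, it ignores details such as hash validation.

```python
import os


def resolve_dataset_path(fname, cache_dir=None):
    """Return (path, needs_download) for a dataset file name.

    Hypothetical helper for illustration only; not the Keras code.
    """
    if cache_dir is None:
        # Assumed default cache root.
        cache_dir = os.path.join(os.path.expanduser('~'), '.keras')
    # os.path.join discards the earlier components when fname is absolute,
    # so a full path is used as-is.
    fpath = os.path.join(cache_dir, 'datasets', fname)
    return fpath, not os.path.exists(fpath)
```

With an existing absolute path such as '/tmp/data/mnist.npz', the returned path is unchanged and needs_download is False; a bare file name is looked up under the cache directory instead.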
- Download the file https://s3.amazonaws.com/img-datasets/mnist.npz
- Move mnist.npz to the ~/.keras/datasets/ directory
- Load the data:
import keras
from keras.datasets import mnist

# load_data() looks for mnist.npz in the Keras datasets directory first
(X_train, y_train), (X_test, y_test) = mnist.load_data()
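If Keras is not available on the machine, the downloaded file can also be read with NumPy alone. This is a rough sketch, assuming the archive stores its arrays under the keys x_train, y_train, x_test and y_test. The helper name load_mnist_npz is made up for illustration.

```python
import numpy as np


def load_mnist_npz(path):
    """Read the four MNIST arrays from a local mnist.npz file.

    `load_mnist_npz` is a hypothetical helper; the key names below are
    assumed to match the arrays stored in the archive.
    """
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)
```

The return value has the same nested-tuple shape as mnist.load_data(), so (X_train, y_train), (X_test, y_test) = load_mnist_npz('mnist.npz') can stand in for it.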