Using TFRecords with Keras
Change:
"label": tf.FixedLenSequenceFeature([1]...
into:
"label": tf.FixedLenSequenceFeature([]...
Unfortunately this is not explained in the documentation on the website, but some explanation can be found in the docstring of FixedLenSequenceFeature on GitHub. Basically, if your data consists of a single dimension (plus a batch dimension), you don't need to specify that dimension: pass an empty shape, [].
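To make the shape difference concrete, here is a minimal sketch, assuming TensorFlow 2.x with eager execution and the `tf.io` names (the snippets above use the older top-level `tf.` names). The feature values are made up for illustration. With `[]` the label should parse to a single dimension, while `[1]` adds a trailing dimension of size 1.

```python
import tensorflow as tf

# Serialize one example whose "label" feature holds three int64 values.
example = tf.train.Example(features=tf.train.Features(feature={
    "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[3, 1, 2])),
}))
serialized = example.SerializeToString()

def parse_label(per_step_shape):
    # allow_missing=True is required when FixedLenSequenceFeature is used
    # with parse_single_example / parse_example.
    spec = {"label": tf.io.FixedLenSequenceFeature(
        per_step_shape, tf.int64, allow_missing=True)}
    return tf.io.parse_single_example(serialized, spec)["label"]

scalar_steps = parse_label([])   # each element is a scalar
vector_steps = parse_label([1])  # each element is a length-1 vector

print(scalar_steps.shape, vector_steps.shape)
```

The extra trailing dimension from `[1]` is what tends to clash with a loss or metric that expects one label per example.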
You forgot to include this line from the example:
parsed_features = tf.parse_single_example(proto, f)
Add it to _parse_function.
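For reference, a parse function with that line in place might look like the sketch below. It assumes TensorFlow 2.x, where the call lives at `tf.io.parse_single_example`. The feature spec `f` with its "image" and "label" keys is hypothetical, as is the `make_example` helper; substitute whatever your records actually contain.

```python
import tensorflow as tf

# Hypothetical feature spec standing in for `f`; use your own keys and dtypes.
f = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def _parse_function(proto):
    # Turn one serialized tf.train.Example into a dict of tensors.
    parsed_features = tf.io.parse_single_example(proto, f)
    return parsed_features["image"], parsed_features["label"]

def make_example(image_bytes, label):
    # Hypothetical helper that builds a serialized record for the demo.
    return tf.train.Example(features=tf.train.Features(feature={
        "image": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
        "label": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[label])),
    })).SerializeToString()

# In-memory records stand in for a TFRecordDataset so the sketch runs without files.
records = [make_example(b"img0", 0), make_example(b"img1", 1)]
parsed = list(tf.data.Dataset.from_tensor_slices(records).map(_parse_function))
```

With real data you would replace the in-memory list with `tf.data.TFRecordDataset(filenames)` and keep the `map(_parse_function)` call unchanged.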
Also, you can return just the dataset object: Keras supports instances of tf.data.Dataset as well as iterators. It also looks a bit odd to shuffle and repeat first and only then parse the tf.Examples. Here is example code that works for me:
def dataset(filenames, batch_size, img_height, img_width, is_training=False):
    # TfExampleDecoder and preprocess_image are defined elsewhere (not shown).
    decoder = TfExampleDecoder()

    def preprocess(image, boxes, classes):
        image = preprocess_image(image, resize_height=img_height, resize_width=img_width)
        # NOTE: `groundtruth` is not defined in this excerpt; it is presumably
        # built from `boxes` and `classes`.
        return image, groundtruth

    # Parse first, then shuffle, then map + batch, then repeat and prefetch.
    ds = tf.data.TFRecordDataset(filenames)
    ds = ds.map(decoder.decode, num_parallel_calls=8)
    if is_training:
        ds = ds.shuffle(1000 + 3 * batch_size)
    ds = ds.apply(tf.contrib.data.map_and_batch(
        map_func=preprocess, batch_size=batch_size, num_parallel_calls=8))
    ds = ds.repeat()
    ds = ds.prefetch(buffer_size=batch_size)
    return ds

train_dataset = dataset(args.train_data, args.batch_size,
                        args.img_height, args.img_width,
                        is_training=True)

# The dataset repeats indefinitely, so steps_per_epoch must be given.
model.fit(train_dataset,
          steps_per_epoch=args.steps_per_epoch,
          epochs=args.max_epochs,
          callbacks=callbacks,
          initial_epoch=0)
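The snippet above relies on `tf.contrib`, which was removed in TensorFlow 2.x. As a rough sketch of the same ordering under TF 2.4 or later (an assumption, since `tf.data.AUTOTUNE` is used), a plain `map` followed by `batch` can stand in for `map_and_batch`. The in-memory tensors and the scaling lambda are placeholders for the decoded records and the real preprocessing.

```python
import tensorflow as tf

def dataset(features, labels, batch_size, is_training=False):
    # Synthetic in-memory tensors stand in for decoded TFRecords.
    ds = tf.data.Dataset.from_tensor_slices((features, labels))
    if is_training:
        ds = ds.shuffle(1000 + 3 * batch_size)
    # map followed by batch takes the place of tf.contrib.data.map_and_batch.
    ds = ds.map(lambda x, y: (tf.cast(x, tf.float32) / 255.0, y),
                num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.batch(batch_size)
    ds = ds.repeat()
    return ds.prefetch(tf.data.AUTOTUNE)

features = tf.zeros([10, 4], dtype=tf.uint8)
labels = tf.range(10, dtype=tf.int64)
train_dataset = dataset(features, labels, batch_size=4, is_training=True)

# Pull one batch to confirm the shapes before handing the dataset to model.fit.
batch_x, batch_y = next(iter(train_dataset))
```

Because the dataset repeats forever, `model.fit` still needs `steps_per_epoch`, exactly as in the original call.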
This looks like an issue with your data or preprocessing pipeline rather than with Keras. Try inspecting what the dataset actually yields with debugging code like this:
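# batch_size is a required positional argument of dataset().
ds = dataset(args.data, args.batch_size, args.img_height, args.img_width, is_training=True)
image_t, classes_t = ds.make_one_shot_iterator().get_next()
with tf.Session() as sess:
    # The dataset repeats, so this loop runs until you interrupt it.
    while True:
        image, classes = sess.run([image_t, classes_t])
        # Do something with the data: display, log etc.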
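Under TensorFlow 2.x, `tf.Session` is no longer part of the top-level API, and with eager execution the dataset can be iterated directly. A minimal sketch of the same inspection, using a toy dataset in place of the real pipeline (the shapes and values are made up):

```python
import tensorflow as tf

# Toy dataset standing in for the real pipeline: six 2x2 "images" with labels.
ds = tf.data.Dataset.from_tensor_slices(
    (tf.zeros([6, 2, 2]), tf.range(6))).batch(2).repeat()

inspected = []
# take(3) bounds the loop; a repeated dataset would otherwise never end.
for image, classes in ds.take(3):
    inspected.append((image.shape, classes.numpy().tolist()))
    print(image.shape, image.dtype, classes.numpy())
```

Checking shapes, dtypes and a few label values this way usually shows quickly whether the problem is in the data rather than in the model.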