Using Keras, How can I load weights generated from CuDNNLSTM into LSTM Model?

The reason is that the CuDNNLSTM layer has twice as many bias weights as LSTM. This comes from the underlying implementation of the cuDNN API. You can compare the following equations (copied from the cuDNN user's guide) to the usual LSTM equations:

[Image: cuDNN LSTM equations]

CuDNN uses two bias terms, so the number of bias weights is doubled. To convert it back to what LSTM uses, the two bias terms need to be summed.
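To make the bias part concrete, here is a minimal NumPy sketch. It assumes the CuDNNLSTM bias is stored as one flat vector of length `8 * units`, with the input-side biases in the first half and the recurrent-side biases in the second half. The function name `merge_cudnn_bias` is made up for illustration and is not part of Keras. It only covers the bias; the kernel matrices may also need rearranging, which is not shown here.

```python
import numpy as np


def merge_cudnn_bias(cudnn_bias):
    """Sum the two cuDNN bias halves into a single LSTM-sized bias vector.

    Assumption: the bias is a flat vector whose first half holds the
    input biases and whose second half holds the recurrent biases.
    """
    cudnn_bias = np.asarray(cudnn_bias)
    if cudnn_bias.ndim != 1 or cudnn_bias.shape[0] % 2 != 0:
        raise ValueError("expected a flat bias vector of even length")
    input_bias, recurrent_bias = np.split(cudnn_bias, 2)
    return input_bias + recurrent_bias


units = 2
cudnn_bias = np.arange(8 * units, dtype="float32")  # dummy values, length 8 * units
lstm_bias = merge_cudnn_bias(cudnn_bias)            # length 4 * units
```

The result has `4 * units` entries, which is the bias size a plain LSTM layer expects.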

I've submitted a PR to do the conversion and it's merged. You can install the latest Keras from GitHub and the problem in weight loading should be solved.

Just to add to @Yu-Yang's answer above, the latest Keras will automatically convert the CuDNNLSTM weights to LSTM, but it won't change your .json model architecture for you.

To run inference on LSTM, you'll need to open the JSON file and manually change all instances of CuDNNLSTM to LSTM. Then run model_from_json to load your model, and load_weights to load your weights.

I'd tried running load_weights without manually changing the CuDNNLSTM model at first, and got a bunch of errors.
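If you'd rather not edit the JSON by hand, the same steps can be scripted. This is only a sketch: `cudnn_to_lstm` and `load_as_lstm` are hypothetical helper names, and the file paths are placeholders. It also sets `recurrent_activation` to `sigmoid` on the renamed layers, on the assumption that your Keras version defaults LSTM to `hard_sigmoid` while cuDNN uses a plain sigmoid; drop that line if it doesn't apply to your setup.

```python
import json


def cudnn_to_lstm(node):
    """Recursively rename CuDNNLSTM layers to LSTM in a parsed model config."""
    if isinstance(node, dict):
        if node.get("class_name") == "CuDNNLSTM":
            node["class_name"] = "LSTM"
            # Assumption: cuDNN uses a plain sigmoid for the gates, so match
            # it explicitly instead of relying on the LSTM layer's default.
            node.setdefault("config", {}).setdefault("recurrent_activation", "sigmoid")
        for value in node.values():
            cudnn_to_lstm(value)  # also reaches layers nested in wrappers
    elif isinstance(node, list):
        for item in node:
            cudnn_to_lstm(item)
    return node


def load_as_lstm(json_path, weights_path):
    """Load a model saved with CuDNNLSTM layers as a plain LSTM model."""
    from keras.models import model_from_json  # imported here so the helper above needs no Keras

    with open(json_path) as f:
        config = cudnn_to_lstm(json.load(f))
    model = model_from_json(json.dumps(config))
    model.load_weights(weights_path)
    return model
```

Usage would look like `model = load_as_lstm("model.json", "weights.h5")`, with both file names replaced by your own.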
