LSTM – How to Interpret Model Weights Extracted from TensorFlow2 Keras LSTM Model?

I fitted a tensorflow.keras.layers.LSTM model and extracted the model weights via get_weights(). However, I find it hard to interpret the weights array.

Specifically, I defined the model as

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(128, input_shape=(10, 10)))
model.add(Dense(1))

and fitted it on data with the following shapes:

train_X.shape, train_y.shape, test_X.shape, test_y.shape
## ((23, 10, 10), (23,), (6, 10, 10), (6,))

The code for model fitting:

model.fit(train_X, train_y, epochs=50, batch_size=4,
          validation_data=(test_X, test_y), verbose=2, shuffle=True)

The shape of model weights:

[w.shape for w in model.get_weights()]
## [(10, 512), (128, 512), (512,), (128, 1), (1,)]

The math formula of LSTM:
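
A standard formulation, written in the row-vector convention that matches the Keras weight shapes (the symbol names W, U and b are assumed from the slicing code in this post), is:

```latex
\begin{aligned}
i_t &= \sigma\left(x_t W_i + h_{t-1} U_i + b_i\right) \\
f_t &= \sigma\left(x_t W_f + h_{t-1} U_f + b_f\right) \\
\tilde{c}_t &= \tanh\left(x_t W_c + h_{t-1} U_c + b_c\right) \\
o_t &= \sigma\left(x_t W_o + h_{t-1} U_o + b_o\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

Here x_t is the input at time step t, h_t the hidden state, c_t the cell state, and \odot denotes element-wise multiplication.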

As you can see from the formula, there are eight weight matrices and four bias vectors. However, I don't know how to match them to the weights array.

Best Answer

When you print

print(model.layers[0].trainable_weights)

you should see three tensors: lstm_1/kernel, lstm_1/recurrent_kernel and lstm_1/bias:0 (the name prefix depends on the layer name). One dimension of each tensor should equal 4 * number_of_units, where number_of_units is the number of LSTM units (128 in the question).

Try:

units = int(model.layers[0].trainable_weights[0].shape[1]) // 4
print("Number of units: ", units)

That is because each tensor contains the weights for the four LSTM gates, concatenated in this order:

i (input), f (forget), c (cell state) and o (output)

Therefore, to extract the weights you can simply use the slice operator:

W = model.layers[0].get_weights()[0]
U = model.layers[0].get_weights()[1]
b = model.layers[0].get_weights()[2]

W_i = W[:, :units]
W_f = W[:, units: units * 2]
W_c = W[:, units * 2: units * 3]
W_o = W[:, units * 3:]

U_i = U[:, :units]
U_f = U[:, units: units * 2]
U_c = U[:, units * 2: units * 3]
U_o = U[:, units * 3:]

b_i = b[:units]
b_f = b[units: units * 2]
b_c = b[units * 2: units * 3]
b_o = b[units * 3:]
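
One way to sanity-check this interpretation is to run the LSTM recurrence by hand on the sliced blocks. The sketch below uses NumPy only, with random arrays standing in for the output of get_weights() and the shapes from the question. The helper names split_gates and lstm_step are hypothetical. It assumes the layer's default activations (tanh, with a sigmoid recurrent activation).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def split_gates(W, U, b, units):
    """Slice Keras-style LSTM weights into per-gate blocks (order: i, f, c, o)."""
    gates = {}
    for k, name in enumerate("ifco"):
        cols = slice(k * units, (k + 1) * units)
        gates[name] = (W[:, cols], U[:, cols], b[cols])
    return gates

def lstm_step(x_t, h_prev, c_prev, gates):
    """One LSTM time step (assumes sigmoid gates and tanh activations)."""
    pre = {n: x_t @ Wg + h_prev @ Ug + bg for n, (Wg, Ug, bg) in gates.items()}
    i = sigmoid(pre["i"])          # input gate
    f = sigmoid(pre["f"])          # forget gate
    c_tilde = np.tanh(pre["c"])    # candidate cell state
    o = sigmoid(pre["o"])          # output gate
    c_t = f * c_prev + i * c_tilde
    h_t = o * np.tanh(c_t)
    return h_t, c_t

# Random stand-ins with the shapes from the question (not trained weights).
rng = np.random.default_rng(0)
input_dim, units, timesteps = 10, 128, 10
W = rng.normal(scale=0.1, size=(input_dim, 4 * units))  # kernel
U = rng.normal(scale=0.1, size=(units, 4 * units))      # recurrent kernel
b = np.zeros(4 * units)                                 # bias

gates = split_gates(W, U, b, units)

# Run one sequence of 10 time steps through the recurrence.
x = rng.normal(size=(timesteps, input_dim))
h = np.zeros(units)
c = np.zeros(units)
for t in range(timesteps):
    h, c = lstm_step(x[t], h, c, gates)

print({n: tuple(a.shape for a in g) for n, g in gates.items()})
print(h.shape)
```

With a real model, W, U and b would come from model.layers[0].get_weights(). Multiplying the final h by the Dense kernel and adding its bias should then approximate model.predict for the same input sequence, provided the layer uses its default settings.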
