Hi,
First of all, thank you very much for your easy-to-follow implementation! Very intuitive and simple. 👍
My question is about your use of LSTMCell to implement the recurrent version of A3C.
From what I can tell from your code, each batch fed to the model contains data from a single timestep, and that batch is used to update the whole model, including the LSTMCell. If the intent was a recurrent procedure, though, the forward pass of the LSTMCell should be run on every timestep of the sequence within the recurrent part. That loop appears to be missing, so the hidden and cell states of the LSTMCell do not contribute to the gradient flow.
Simply put, I mean such a part is missing:
for i in inputs:
    # Step through the sequence one element at a time.
    # After each step, hidden contains the hidden state.
    out, hidden = lstm(i, hidden)

Otherwise, hx and cx should not have been detached:
Line 42 in 48d9584: cx = cx.detach()
Line 43 in 48d9584: hx = hx.detach()
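To illustrate the concern, here is a minimal sketch of my own (not taken from the repository). It assumes PyTorch's `nn.LSTMCell`, and the sizes, the `run` helper, and the dummy loss are all hypothetical. It compares the gradient that reaches the first input of a sequence when the states are carried across timesteps with the gradient when `hx` and `cx` are detached before every step. The repository may detach at a different point, for example once per rollout, in which case the effect would differ.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
input_size, hidden_size, seq_len = 4, 8, 5
lstm = nn.LSTMCell(input_size, hidden_size)
inputs = torch.randn(seq_len, 1, input_size, requires_grad=True)


def run(detach_each_step):
    """Unroll the cell over the sequence and return d(loss)/d(inputs)."""
    hx = torch.zeros(1, hidden_size)
    cx = torch.zeros(1, hidden_size)
    for t in range(seq_len):
        if detach_each_step:
            # Cut the graph: earlier timesteps no longer receive gradient.
            hx, cx = hx.detach(), cx.detach()
        hx, cx = lstm(inputs[t], (hx, cx))
    loss = hx.sum()  # dummy loss on the final hidden state
    (grad,) = torch.autograd.grad(loss, inputs)
    return grad


grad_bptt = run(detach_each_step=False)
grad_trunc = run(detach_each_step=True)

# Gradient magnitude reaching the first and last timestep's input.
first_step_grad_bptt = grad_bptt[0].abs().sum().item()
first_step_grad_trunc = grad_trunc[0].abs().sum().item()
last_step_grad_trunc = grad_trunc[-1].abs().sum().item()

print("first step, states carried: ", first_step_grad_bptt)
print("first step, states detached:", first_step_grad_trunc)
```

In this sketch, detaching before every step leaves only the last timestep's input with a nonzero gradient, which is the behaviour I suspect in the current code.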
Am I right, or am I missing something?
Thank you in advance.