Open
Description
Good morning, David. Thank you very much for sharing the code. I have a question about the results (the trend of the graph). As I understand it, the goal in DDPG is to maximize the reward: it should increase gradually until convergence, at which point it fluctuates around a steady value instead of continuing to increase. Your results don't show that behavior. Could you let me know what your criterion for the stopping point is? When does the training process stop? Once again, thank you so much for your help. Your code has helped me a lot with my studies.
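
To make the question concrete, here is a minimal sketch of the kind of stopping rule being asked about. It is not taken from the repository: the function name `has_plateaued` and the parameters `window` and `tolerance` are hypothetical, and the rule itself (comparing the mean reward of the two most recent windows of episodes) is only one assumed way to detect that the reward has stopped increasing.

```python
def has_plateaued(episode_rewards, window=100, tolerance=0.01):
    """Hypothetical plateau check, not the repository's actual criterion.

    Returns True when the mean reward over the last `window` episodes
    differs from the mean over the preceding `window` episodes by less
    than `tolerance`, measured relative to the earlier mean.
    """
    # Not enough history yet to compare two full windows.
    if len(episode_rewards) < 2 * window:
        return False

    recent = episode_rewards[-window:]
    previous = episode_rewards[-2 * window:-window]

    recent_mean = sum(recent) / window
    previous_mean = sum(previous) / window

    # Guard against division by zero when the earlier mean is ~0.
    scale = max(abs(previous_mean), 1e-8)
    return abs(recent_mean - previous_mean) / scale < tolerance
```

In a training loop, one would append each episode's total reward to a list and stop once `has_plateaued(rewards)` returns True. A fixed episode budget, which many DDPG implementations use instead, would not produce a visible plateau if the budget ends before convergence.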