Thomas Simonini
May 30, 2018


Hi, I think RHS means right-hand side. So what you want to know is what we multiply by to get delta w. It's the gradient. To keep it simple: the change in weights is defined by the gradient of the loss. In this case our loss is built on the difference between our target_q_value and our predicted q_value (aka the TD error); more precisely, it is the squared TD error. Taking the gradient of that loss with respect to w, with Q_target treated as a fixed number, gives our change in weights: delta w = lr * (Q_target - Q_value) * ∇w Q_value. Here lr is the learning rate, and ∇w Q_value is the gradient of our predicted Q_value with respect to the weights w.
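To make the formula concrete, here is a minimal sketch in plain Python. It assumes a linear Q function, Q_value = w · features, so that ∇w Q_value is simply the feature vector; a neural network would get this gradient from backpropagation instead. The names `q_value` and `delta_w`, the numbers, and the choice Q_target = reward + gamma * max_next_q are all illustrative assumptions, not something taken from the article.

```python
# Sketch only: assumes a linear Q function, Q_value = w . features.
# For that choice, the gradient of Q_value with respect to w is `features`.

def q_value(w, features):
    """Predicted Q value for one (state, action) pair."""
    return sum(wi * xi for wi, xi in zip(w, features))


def delta_w(w, features, q_target, lr):
    """Change in weights: lr * (Q_target - Q_value) * grad_w Q_value."""
    td_error = q_target - q_value(w, features)
    grad_q = features  # gradient of a linear Q with respect to w
    return [lr * td_error * g for g in grad_q]


# Illustrative numbers (assumptions for this example).
w = [0.5, -0.2]
features = [1.0, 2.0]
reward, gamma, max_next_q = 1.0, 0.9, 2.0

# Assumed Q-learning style target, treated as a constant in the update.
q_target = reward + gamma * max_next_q

dw = delta_w(w, features, q_target, lr=0.1)
w_new = [wi + d for wi, d in zip(w, dw)]

print("TD error before:", q_target - q_value(w, features))
print("TD error after: ", q_target - q_value(w_new, features))
```

After the update, the predicted Q value for these features moves toward Q_target, which is what the (Q_target - Q_value) factor is there to do.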


