May 30, 2018
Hi, I think RHS means right-hand side. So what you want to know is what the weight change Δw is multiplied by. It's the gradient: put simply, the change in weights is determined by the gradient of the loss. In this case the loss is built from the difference between the target_q_value and our predicted q_value (aka the TD error). Our change in weights is Δw = lr * (Q_target - Q_value) * ∇w Q_value, where ∇w Q_value is the gradient of the predicted Q_value with respect to the weights w. This is what you get from a gradient descent step on half the squared TD error when Q_target is treated as a constant.
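To make the update concrete, here is a minimal sketch in plain Python. It assumes a linear Q function, Q_value = w · features, which is my simplification for illustration and not something from the article; in that case ∇w Q_value is just the feature vector. The names `q_value`, `td_update`, and `features` are hypothetical.

```python
def q_value(w, features):
    """Predicted Q-value for an assumed linear approximator: Q = w . features."""
    return sum(wi * xi for wi, xi in zip(w, features))


def td_update(w, features, q_target, lr):
    """Return new weights after one step: w + lr * (Q_target - Q_value) * grad_w Q_value."""
    td_error = q_target - q_value(w, features)
    # For a linear Q function, the gradient of Q_value with respect to w
    # is simply the feature vector (assumption of this sketch).
    grad_q = features
    return [wi + lr * td_error * gi for wi, gi in zip(w, grad_q)]


w = [0.0, 0.0]
features = [1.0, 2.0]
w_new = td_update(w, features, q_target=1.0, lr=0.1)
```

With a neural network the only thing that changes is how ∇w Q_value is computed (by backpropagation instead of reading off the features); the TD error still scales the step.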