Jun 15, 2018
Hello,
Indeed, in reality we are usually in stochastic environments, but Taxi-v2 is a deterministic environment (if you choose to go left, you'll go left). That's why I didn't consider the stochastic aspect: in this environment it doesn't exist.
On the other hand, the Frozen Lake implementation, also in this article, is a stochastic environment: the action you choose is not always the move you actually make. But remember that in stochastic environments it is generally better to work with policy gradients.
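To make the difference concrete, here is a minimal sketch of the two kinds of transition on a small grid. It does not use Gym itself: `step`, `outcomes`, and the `slippery` flag are hypothetical names for illustration, and the slip rule (the intended direction or one of the two perpendicular ones, each with probability 1/3) is my assumption about how a slippery Frozen Lake behaves.

```python
import random

# Action encoding (an assumption for this sketch).
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3
MOVES = {LEFT: (0, -1), DOWN: (1, 0), RIGHT: (0, 1), UP: (-1, 0)}


def step(pos, action, size=4, slippery=False, rng=random):
    """Return the next (row, col) on a size x size grid."""
    if slippery:
        # Stochastic case: the intended action or one of its two
        # perpendicular neighbours, each picked with probability 1/3.
        action = rng.choice([(action - 1) % 4, action, (action + 1) % 4])
    d_row, d_col = MOVES[action]
    # Clamp to the grid so the agent cannot leave the board.
    row = min(max(pos[0] + d_row, 0), size - 1)
    col = min(max(pos[1] + d_col, 0), size - 1)
    return (row, col)


def outcomes(pos, action, slippery, trials=1000, seed=0):
    """Collect the distinct next positions seen over many trials."""
    rng = random.Random(seed)
    return {step(pos, action, slippery=slippery, rng=rng) for _ in range(trials)}


if __name__ == "__main__":
    print("deterministic:", outcomes((1, 1), LEFT, slippery=False))
    print("stochastic:   ", outcomes((1, 1), LEFT, slippery=True))
```

In the deterministic case, going left from `(1, 1)` always lands on `(1, 0)`. In the slippery case the same action can also end up on `(0, 1)` or `(2, 1)`, which is why the agent has to reason about expected outcomes rather than a single known next state.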