Thomas Simonini
1 min readMar 31, 2018


Hi,

Thanks for your feedback.

Good question. In Deep Reinforcement Learning we generally don't use model-based algorithms because they're too costly: for each environment, the agent needs to build a model of it (to understand how the environment behaves). Model-based algorithms are agents that create an internal representation of the environment and use it to decide how to behave.

On the other hand, policy-based and value-based methods just learn through trial and error: they don't have complete knowledge of the environment, they simply learn the best thing to do in each situation.
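To make this concrete, here is a minimal value-based sketch. The environment below (a tiny 5-state corridor) is a made-up toy for illustration, not something from my answer: the agent never sees the dynamics directly, it only samples transitions by trial and error and improves a Q-value table from them.

```python
import random

random.seed(0)  # make the illustrative run reproducible

# Hypothetical toy environment: a 5-state corridor where walking
# right reaches a goal that pays reward 1. The agent does NOT know this.
N_STATES = 5          # states 0..4; state 4 is the goal
ACTIONS = [0, 1]      # 0 = left, 1 = right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.3   # high exploration for this tiny example

Q = [[0.0, 0.0] for _ in range(N_STATES)]  # value table, learned by trial and error

def step(state, action):
    """Environment dynamics -- hidden from the agent, which only samples them."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

for _ in range(300):                       # episodes of pure trial and error
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy: mostly exploit current estimates, sometimes explore
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: Q[s][a])
        s2, r = step(s, a)
        # Q-learning update: improve the estimate from this one sampled transition
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

# The learned greedy policy: in every non-goal state, go right
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
```

Note that the agent ends up acting well without ever storing what the environment looks like; all it keeps is "which action is best in which situation."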

To understand this better, let's take an example. Imagine you play Super Mario Bros, and treat each level as an independent environment. If we use a policy-based RL algorithm, we won't need to retrain Mario for each level, since our policy will generalize the important things well enough (for instance, when you see an enemy you must avoid it or jump on it).

On the other hand, a model-based reinforcement learning algorithm can't play a new environment (a new level of Super Mario Bros) before building a new model, even if our agent has already seen enemies (Goombas and turtles) in another environment (level). Our agent is totally lost, because its logic is based on understanding the environment (what the environment looks like, which enemies appear at which moment, etc.), not on how to behave in unseen situations.
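By contrast, here is what the model-based loop looks like on the same made-up corridor toy (again, an illustrative sketch, not code from my answer): the agent first records an explicit model of the environment, then plans inside that model. In a new environment that table is empty and has to be rebuilt from scratch, which is exactly the cost described above.

```python
# Hypothetical toy setup: 5-state corridor, actions 0 = left, 1 = right.
N_STATES, ACTIONS, GAMMA = 5, [0, 1], 0.9

def true_step(state, action):
    """Real environment dynamics (unknown to the agent at the start)."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

# 1. Build the model: record what every (state, action) pair does.
#    This table is environment-specific -- a new level means a new table.
model = {}
for s in range(N_STATES):
    for a in ACTIONS:
        model[(s, a)] = true_step(s, a)   # learned (next_state, reward)

# 2. Plan entirely inside the model (value iteration), without
#    touching the real environment again.
V = [0.0] * N_STATES                      # V[4] stays 0: goal is terminal
for _ in range(50):
    for s in range(N_STATES - 1):
        V[s] = max(model[(s, a)][1] + GAMMA * V[model[(s, a)][0]]
                   for a in ACTIONS)

# 3. Read off the plan: the greedy action under the learned model.
policy = [max(ACTIONS, key=lambda a: model[(s, a)][1] + GAMMA * V[model[(s, a)][0]])
          for s in range(N_STATES - 1)]
```

The planning step is powerful once the model exists, but the policy it produces is tied to that specific model, which is why the agent has to start over in a new level.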

Hope this helped!
