Deep Reinforcement Learning Course v2.0

Chapter 2 of the Deep Reinforcement Learning Course v2.0


This article is the second part of Chapter 2 of the Deep Reinforcement Learning Course v2.0🕹️, a free course from beginner to expert with TensorFlow and PyTorch. Check the syllabus here.

In the first part of this chapter, we learned about value-based methods and the difference between Monte Carlo and Temporal Difference learning.

In this second part, we’ll study Q-Learning and implement our first RL agent: a Q-Learning autonomous taxi that needs to learn to navigate a city to transport its passengers from point A to point B.

This chapter is fundamental if you want to work on Deep Q-Learning (chapter 3), the first Deep RL algorithm that was able to play Atari games and exceed human-level performance on some of them (Breakout, Space Invaders…). …


Deep Reinforcement Learning Course v2.0

Chapter 2 of the Deep Reinforcement Learning Course v2.0


This article is part of the Deep Reinforcement Learning Course v2.0🕹️, a free course from beginner to expert with TensorFlow and PyTorch. Check the syllabus here.

Before studying this new chapter, you should master the elements covered in the first chapter. If that’s not the case, you can check the video version here or the article version there.

In the first chapter of this course, we learned what Reinforcement Learning (RL) is, how the RL process works, and the different methods to solve an RL problem.

Today, we’re going to dive deeper into one of these methods, value-based methods, and learn about our first RL algorithm: Q-Learning. …


Chapter 1 of the Deep Reinforcement Learning Course v2.0


This article is part of the Deep Reinforcement Learning Course, a free course from beginner to expert. Check the syllabus here.

Welcome to the most fascinating topic in Artificial Intelligence: Deep Reinforcement Learning.

Deep RL is a type of Machine Learning where an agent learns how to behave in an environment by performing actions and seeing the results.

Since 2013 and the Deep Q-Learning paper, we’ve seen a lot of breakthroughs. From OpenAI Five, which beat some of the best Dota 2 players in the world, to the Dexterity project, we live in an exciting moment in Deep RL research.

OpenAI Five, an AI that beat some of the best Dota 2 players in the world

Moreover, since the first version of this course in 2018, a ton of new libraries (TF-Agents, Stable Baselines 2.0…) and environments were launched: MineRL (Minecraft), Unity ML-Agents, OpenAI Retro (NES, SNES, Genesis games…). You now have access to so many amazing games to build your agents.


A free course in Deep Reinforcement Learning from beginner to expert.

Some of the agents you’ll implement during this course

🎉 I’m happy to announce the launch of the new version of the Deep Reinforcement Learning Course 🥳, a free course from beginner to expert where you learn to master the skills and architectures you need to become a deep reinforcement learning expert with TensorFlow and PyTorch.

Since the launch of the first version in 2018, the course has received more than 40,000 claps and 2,500 GitHub stars.

Since then, a lot of breakthroughs have happened in Deep RL. New libraries were published and some of our implementations became obsolete.


Unity-ML Agents Course

Train an agent to get the golden statue in this dangerous environment.


This article is the third chapter of a new free course on Deep Reinforcement Learning with Unity, where we’ll create agents that learn to play video games using the Unity game engine 🕹. Check the syllabus here.

If you have never studied Deep Reinforcement Learning before, you should check out the free course Deep Reinforcement Learning with TensorFlow.

In the last two articles, you learned to use ML-Agents and trained two agents. The first was able to jump over walls, and the second learned to destroy a pyramid to get the golden brick. It’s time to do something harder.

When I was thinking about creating a custom environment, I remembered the famous scene in Indiana Jones, where Indy needs to get the golden statue and avoid a lot of traps to survive. …


Unity-ML Agents Course

Train a curious agent to destroy Pyramids.

Unity ML-Agents

This article is the second chapter of a new free course on Deep Reinforcement Learning with Unity, where we’ll create agents with TensorFlow that learn to play video games using the Unity game engine 🕹. Check the syllabus here.

If you have never studied Deep Reinforcement Learning before, you should check out the free course Deep Reinforcement Learning with TensorFlow.

Last time, we learned how Unity ML-Agents works and trained an agent that learned to jump over walls.


This was a nice experience, but we want to create agents that can solve more complex tasks. So today we’ll train a smarter one that needs to press a button to spawn a pyramid, then navigate to the pyramid, knock it over, and move to the gold brick at the top.


Unity-ML Agents Course

Train a reinforcement learning agent to jump over walls.


This article is part of a new free course on Deep Reinforcement Learning with Unity, where we’ll create agents with TensorFlow that learn to play video games using the Unity game engine 🕹. Check the syllabus here.

If you have never studied Deep Reinforcement Learning before, you should check out the free course Deep Reinforcement Learning with TensorFlow.

The past few years have witnessed breakthroughs in reinforcement learning (RL). …



Last time, we learned about curiosity in deep reinforcement learning. The idea of curiosity-driven learning is to build a reward function that is intrinsic to the agent (generated by the agent itself). That is, the agent is a self-learner, as it is both the student and its own feedback teacher.

To generate this reward, we introduced the intrinsic curiosity module (ICM). But this technique has serious drawbacks because of the noisy TV problem, which we’ll introduce here.

So today, we’ll study curiosity-driven learning through random network distillation used in the paper Exploration by Random Network Distillation, and you’ll learn how to implement a PPO agent playing Montezuma’s Revenge with only curiosity as reward. …
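To make the idea of an agent-generated reward concrete, here is a minimal sketch of a random-network-distillation-style curiosity signal. It is an illustration under simplifying assumptions, not the paper's implementation: both "networks" are single linear layers in NumPy, and the sizes, learning rate, and all function names are hypothetical. The intrinsic reward is the error a trainable predictor makes when imitating a fixed random target, so it shrinks for observations the agent has already seen.

```python
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM, FEAT_DIM = 8, 4  # hypothetical sizes, chosen for illustration

# Fixed, randomly initialized target "network" (never trained).
W_target = rng.normal(size=(OBS_DIM, FEAT_DIM))
# Predictor "network", trained to imitate the target on visited observations.
W_pred = np.zeros((OBS_DIM, FEAT_DIM))


def intrinsic_reward(obs):
    """Mean squared error between predictor and target features."""
    error = obs @ W_pred - obs @ W_target
    return float(np.mean(error ** 2))


def update_predictor(obs, lr=0.01):
    """One gradient-descent step on the predictor for one observation."""
    error = obs @ W_pred - obs @ W_target          # shape (FEAT_DIM,)
    grad = 2.0 * np.outer(obs, error) / FEAT_DIM   # gradient of the mean squared error
    W_pred[...] = W_pred - lr * grad               # in-place update of the predictor


familiar = rng.normal(size=OBS_DIM)  # an observation the agent keeps visiting
novel = rng.normal(size=OBS_DIM)     # an observation it has never seen

reward_before = intrinsic_reward(familiar)
for _ in range(200):
    update_predictor(familiar)
reward_after = intrinsic_reward(familiar)
reward_novel = intrinsic_reward(novel)

print(reward_before, reward_after, reward_novel)
```

After training on the familiar observation, its intrinsic reward drops while the unseen observation still yields a large one, which is what pushes a curious agent toward unexplored states.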



In the last few years, we’ve seen a lot of breakthroughs in reinforcement learning (RL). From 2013 with the first deep learning model to successfully learn a policy directly from pixel input using reinforcement learning to the OpenAI Dexterity project in 2019, we live in an exciting moment in RL research.

Today, we’ll learn about curiosity-driven learning methods, one of the most promising series of strategies in deep reinforcement learning. Thanks to this method, we were able to successfully train an agent that wins the first level of Super Mario Bros with only curiosity as a reward.

https://openai.com/blog/learning-dexterity/

Remember that RL is based on the reward hypothesis, which is the idea that each goal can be described as the maximization of the expected cumulative reward. However, the current problem with extrinsic rewards (i.e., rewards given by the environment) is that this function is hard-coded by a human, which does not scale to real-world problems (such as designing a good reward function for autonomous vehicles). …
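As a small illustration of the quantity the reward hypothesis refers to, the sketch below sums the rewards of one episode. The discount factor gamma and the function name are assumptions added here for illustration; they do not come from the article.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of rewards, each discounted by gamma once per time step."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future sum.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Three steps of reward 1 with gamma = 0.5: 1 + 0.5 + 0.25
print(discounted_return([1, 1, 1], gamma=0.5))  # 1.75
```

With a sparse extrinsic reward such as [0, 0, 10], the agent receives no signal until the final step, which is the kind of situation where an intrinsic reward can help.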


As Deep Reinforcement Learning becomes one of the most hyped strategies to achieve AGI (Artificial General Intelligence), more and more libraries are being developed, and choosing the best one for your needs can be a daunting task…


In recent years, we’ve seen an acceleration of innovations in Deep Reinforcement Learning. Examples include AlphaGo beating the champion of the game Go in 2016, OpenAI’s PPO in 2017, the resurgence of curiosity-driven learning agents in 2018 with Uber AI’s Go-Explore and OpenAI’s RND, and finally, OpenAI Five, which beat the best Dota players in the world.

OpenAI Five

Consequently, a lot of deep reinforcement learning libraries have been developed and it can be hard to choose the best library.

About

Thomas Simonini

Deep Reinforcement Learning Engineer 🤖 | Founder Deep Reinforcement Learning course 📚 bit.ly/34fMhwc | I make AI for video
