1 min read · May 30, 2018
Hi!
Policy gradient is a better fit for your situation. Take my CartPole implementation and swap the environment for Lunar Lander; it should work. (You can also add a mini-batch function.)
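In case it helps, here is a minimal sketch of the two pieces of a policy gradient setup that do not depend on the environment, so they should carry over unchanged from CartPole to Lunar Lander: computing discounted, normalized returns for an episode, and splitting collected transitions into mini-batches. The function names `discount_and_normalize` and `mini_batches`, the `gamma` default, and the tuple layout are assumptions made for illustration; they are not taken from the implementation mentioned above.

```python
import math
import random


def discount_and_normalize(rewards, gamma=0.99):
    """Return discounted returns G_t, shifted to zero mean and scaled to unit variance."""
    if not rewards:
        return []
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so that each G_t = r_t + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    mean = sum(returns) / len(returns)
    std = math.sqrt(sum((g - mean) ** 2 for g in returns) / len(returns))
    if std == 0.0:
        # All returns are equal: only center them, to avoid dividing by zero.
        return [g - mean for g in returns]
    return [(g - mean) / std for g in returns]


def mini_batches(transitions, batch_size, rng=None):
    """Yield shuffled mini-batches from a sequence of transitions.

    Each transition is assumed to be a (state, action, return) tuple
    collected with the current policy (hypothetical layout).
    """
    rng = rng or random.Random()
    shuffled = list(transitions)
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]
```

The idea would be to run a few episodes with the current policy, call `discount_and_normalize` on each episode's rewards, pool the resulting transitions, and run one gradient step per batch from `mini_batches`. Only the network's input and output sizes should need to change when switching environments.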
Developer Advocate 🥑 at Hugging Face 🤗| Founder Deep Reinforcement Learning class 📚 https://bit.ly/3QADz2Q |