Reinforcement Learning Demystified: A Gentle Introduction

Episode 1: demystifying agent/environment interaction and the components of a reinforcement learning agent.

Mohammad Ashraf
2 min read · Apr 7, 2018

In this long blog post series, starting with this episode, I’ll try to simplify the theory behind reinforcement learning and its applications, and cover code examples that illustrate the ideas.

MuJoCo’s Humanoids

What is Reinforcement Learning?

Reinforcement learning, or RL for short, is the science of decision making: finding the optimal way to make decisions. When an infant plays or waves its arms, it has no explicit teacher, but it does have a direct sensorimotor connection to its environment. Exercising this connection produces a wealth of information about cause and effect, about the consequences of actions, and about what to do in order to achieve goals.

Fig. 1

This is the key idea behind RL. We have an environment, which represents the outside world to the agent, and an agent that takes actions and receives observations from the environment. Each observation consists of a reward for the agent’s action and information about its new state. The reward tells the agent how good or bad the action it took was, and the state information tells it where it now stands in the environment.

The agent tries to figure out the best actions to take, or the optimal way to behave in the environment, in order to carry out its task in the best possible way.
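To make this loop concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: CorridorEnv is a hypothetical toy environment (a corridor of five cells with the goal at the far end), RandomAgent is a placeholder agent that does not learn anything yet, and the reset/step method names only mirror a common convention rather than any particular library’s API.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: cells 0..length-1, goal at the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        # Start every episode in the leftmost cell.
        self.state = 0
        return self.state

    def step(self, action):
        # action 1 moves right, any other action moves left.
        move = 1 if action == 1 else -1
        # Clamp the position so the agent stays inside the corridor.
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        # Small penalty per step, larger reward for reaching the goal.
        reward = 1.0 if done else -0.1
        return self.state, reward, done


class RandomAgent:
    """Placeholder agent: ignores the observation and picks a random action."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def act(self, observation):
        return self.rng.choice([0, 1])


def run_episode(env, agent, max_steps=100):
    """Run one episode of the act -> observe -> act loop."""
    observation = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = agent.act(observation)               # agent takes an action
        observation, reward, done = env.step(action)  # environment responds
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps
```

Calling run_episode(CorridorEnv(), RandomAgent()) returns the total reward collected and the number of steps taken. A learning agent would keep the same loop and only replace act with something that uses past rewards to prefer better actions.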

Humanoid learning how to run

This is a simulation of a humanoid that learned how to run by repeating the cycle of acting and observing until it finally figured out the best action to take at each time step to achieve its task, i.e., running efficiently.

To continue reading this article, just follow this link to my new website “becomesentient.com”, where I discuss all AI-related topics. Thank you for your consideration.


Mohammad Ashraf

An AI research engineer. Geek about AI and reinforcement learning. Twitter: @MhmdElsersy, GitHub: Neo-47