Simplifying Reinforcement Learning Workflow in MATLAB

Solving OpenAI environment in MATLAB

Sunny Guha
Towards Data Science


Imagine you are interested in solving a certain problem using Reinforcement Learning. You have coded up your environment and compiled a laundry list of Reinforcement Learning (RL) algorithms to try. Implementing the algorithms yourself from scratch is tricky and time-consuming, because it requires many trial runs and involves a lot of implementation tricks. So what do you do?

The best answer is to use an RL framework. An RL framework contains well-tested implementations of RL algorithms. The implementation of the algorithm is off-loaded to the framework, and the only thing the user needs to worry about is the neural architecture of the actor and critic models. There are plenty of frameworks based on TensorFlow and PyTorch out there. However, the Reinforcement Learning Designer app released with MATLAB R2021a is a strong contender in this category as well, and this article is about that.

Typical RL loop (image from mathworks.com)

The RL Designer app is part of the Reinforcement Learning Toolbox. It is essentially a front end for the functionality of the RL Toolbox. The point-and-click aspects of the designer make managing RL workflows supremely easy, and in this article, I will describe how to solve a simple OpenAI Gym environment with the app. I have also created a YouTube series that delves into the details of Reinforcement Learning in MATLAB. The video version of this article is here:

The RL Designer app looks a bit similar to the Deep Network Designer app. On the left pane, you can find Agents, Environments, Results, and Environment details. You can perform the entire RL workflow from within the app itself. Since we want to make things a bit challenging, we will first demonstrate how to load in an external environment and train a DQN on it using our own custom network.

Reinforcement Learning Designer

You will need Python and the OpenAI Gym package installed to be able to load in the environment. Let's begin.

Loading Environment

MATLAB R2021a ships with a few pre-built environments, and they can be loaded by clicking the ‘New’ button in the Environment section. In this article, we will instead load in our custom environment, which is essentially a wrapper for the MountainCar-v0 environment from OpenAI Gym. In the following code, we define the wrapper for the Gym environment. The ‘step’ function performs a step on the Gym environment and returns the details in a MATLAB-friendly format.

OpenAI wrapper
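A minimal sketch of such a wrapper subclasses `rl.env.MATLABEnvironment` from the RL Toolbox and calls Gym through MATLAB's Python interface. The class name, layer of type conversions, and reward handling below are my own assumptions for illustration, not the exact code from the article:

```matlab
% Illustrative wrapper around the OpenAI Gym MountainCar-v0 environment.
classdef MountainCarEnv < rl.env.MATLABEnvironment
    properties
        GymEnv  % handle to the underlying Python gym environment
    end
    methods
        function this = MountainCarEnv()
            % Observation: [position; velocity], discrete actions: 0, 1, 2
            obsInfo = rlNumericSpec([2 1]);
            actInfo = rlFiniteSetSpec([0 1 2]);
            this = this@rl.env.MATLABEnvironment(obsInfo, actInfo);
            this.GymEnv = py.gym.make('MountainCar-v0');
        end
        function [obs, reward, isDone, info] = step(this, action)
            % Forward the action to Gym and convert the outputs to MATLAB
            % types. Note: converting a numpy array may require
            % py.array.array/py.numpy.nditer idioms on older releases.
            result = this.GymEnv.step(int16(action));
            obs    = double(result{1})';
            reward = double(result{2});
            isDone = logical(result{3});
            info   = [];
        end
        function obs = reset(this)
            obs = double(this.GymEnv.reset())';
        end
    end
end
```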

An object of this class needs to be created in the workspace, and then the environment is ready to be imported into the RL Designer app.

Selecting Agent and Model

Just like selecting an environment, the selection of agents is done by clicking the ‘New’ button in the Agent tab area. The app gives a list of algorithms to choose from. We will choose DQN for this task.

Agent selection screen

Once we select an agent, MATLAB creates a default neural network with fully connected layers for the critic (and for the actor, in algorithms that use one). We would like to modify this and use our own custom neural network. This can be done by selecting the agent and importing our custom critic network from the workspace. We create a simple network using the following script and load it into the workspace.

Custom network for DQN
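As an illustrative sketch of such a script (the layer sizes here are my own assumptions): the MountainCar observation is 2-dimensional (position and velocity) and there are 3 discrete actions, so the critic must map 2 inputs to 3 Q-values:

```matlab
% Illustrative critic network for MountainCar-v0 (layer sizes are assumptions).
% Input: 2 observations (position, velocity); output: 3 Q-values, one per action.
net = [
    featureInputLayer(2, 'Normalization', 'none', 'Name', 'state')
    fullyConnectedLayer(24, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(3, 'Name', 'fc2')
    ];
```

The array of layers sits in the workspace as `net`, ready to be imported by the app.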

We first load the network into our MATLAB workspace and then import it into the RL Designer app by selecting the ‘Import’ option from the Agent tab.

We can also analyze and edit the network using the Deep Network Designer app. The network architecture can be arbitrary; the only restriction is that the input and output dimensions must match the requirements of the algorithm and the environment. We also specify the hyperparameters directly in the main window, and we can set the behavior of the target network from this screen as well. The app automatically takes care of all these minor details.
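For reference, the hyperparameters exposed in the app map onto the toolbox's option objects when working from scripts. A sketch with `rlDQNAgentOptions` (the specific values are illustrative, not the ones used in the article):

```matlab
% Script-side DQN hyperparameters (values are illustrative).
agentOpts = rlDQNAgentOptions( ...
    'UseDoubleDQN', true, ...            % DQN variant to use
    'TargetSmoothFactor', 1e-3, ...      % target network update behavior
    'ExperienceBufferLength', 1e5, ...   % replay buffer size
    'MiniBatchSize', 64);                % samples per gradient step
```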

Setting the hyperparameters of the agent

Training Agent

Training can be initiated by clicking the ‘Train’ button on the main menu bar. We get to specify the episode details and the averaging details before starting the process. The training statistics look like the following:

Training step

This is a pretty standard agent training window. Once the training is complete, you can save the agent and the network. The saved agent can then be retrained or used to simulate its performance.
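The episode and averaging details configured in the Train dialog correspond to the toolbox's `rlTrainingOptions` when training from a script instead. A sketch (the values below are illustrative):

```matlab
% Script equivalent of the Train dialog settings (values are illustrative).
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes', 500, ...
    'MaxStepsPerEpisode', 200, ...
    'ScoreAveragingWindowLength', 20, ...      % episodes in the running average
    'StopTrainingCriteria', 'AverageReward', ...
    'StopTrainingValue', -110);                % stop once this average is reached
% trainStats = train(agent, env, trainOpts);
```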

Validating

We can directly simulate our trained agent within the app. The number of simulation episodes can be set in the ‘Simulation’ tab. You can also load pre-trained agents and simulate them to compare different agents. Keep in mind that the simulation step only keeps track of the final score of each episode. If visualization is needed, you would have to simulate the environment manually using scripts.
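Such a manual simulation can be done with the toolbox's `sim` function; the variable names below (`env`, `trainedAgent`) are assumptions standing in for your own environment object and saved agent:

```matlab
% Manually simulate a trained agent for one episode (names are illustrative).
simOpts = rlSimulationOptions('MaxSteps', 500);
experience = sim(env, trainedAgent, simOpts);
totalReward = sum(experience.Reward.Data);  % episode score
```

The returned `experience` structure holds the per-step observations, actions, and rewards, which is what you would use for custom visualization.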

Final Words

We discussed the complete RL workflow in MATLAB. Changing the agent algorithm is pretty seamless, and this is the main selling point of the app. I would like to emphasize that additional functionality is available in the RL Toolbox through scripts, but for the majority of users, the functionality present in the app should be sufficient. The biggest advantage of this app and framework is that it abstracts away all the implementation details to give a seamless experience. If you are already inside the MATLAB ecosystem, give it a shot. Cheers!

References

  1. Code for this article: GitHub link
  2. RL Playlist: Youtube link
  3. Reinforcement Learning Toolbox documentation


I am currently pursuing a PhD in Theoretical Physics. In my spare time, I like to dabble in Deep Learning and Reinforcement Learning.