Bonsai AI: Using Simulink for Deep Reinforcement Learning

Cyrill Glockner
Towards Data Science
4 min read · Feb 28, 2018

This is the second post in our Simulation and Deep Reinforcement Learning (DRL) series. In our first post, we covered the benefits of simulations as training environments for DRL. Now, we’ll focus on how to make simulations + DRL work.

In the example below, we will train a Bonsai BRAIN using a Simulink model. The goal is to teach the BRAIN (an AI model built on the Bonsai Platform) how to tune a wind turbine, maximizing its energy output by keeping it turned into the wind at the optimal angle.

Simulink provides a great training environment for DRL because it allows third parties like Bonsai to integrate with and control simulation models from the outside. This ability is one of the basic requirements for a simulation platform to be usable for deep reinforcement learning with Bonsai AI. More requirements can be found here.
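Concretely, “controllable from the outside” means the platform can reset the model, inject an action at each step, and read back the resulting state and reward, with no human in the loop. A minimal sketch of that contract in Python (the names here are illustrative, not Bonsai’s actual API):

```python
from typing import Dict, Optional, Tuple

State = Dict[str, float]   # observable outputs of the model
Action = Dict[str, float]  # control inputs the learner sets

class ControllableSimulation:
    """Minimal contract a simulator must expose for external DRL training."""

    def reset(self, config: Optional[Dict[str, float]] = None) -> State:
        """Start a fresh episode and return the initial state."""
        raise NotImplementedError

    def step(self, action: Action) -> Tuple[State, float, bool]:
        """Advance one time step under `action`; return (state, reward, done)."""
        raise NotImplementedError
```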

1: Simulation Model

This Simulink Wind Turbine model is provided by The MathWorks. For this scenario, it represents a simple control problem that can be solved by applying reinforcement learning.

Matlab/Simulink Wind Turbine model used for training

2: Identify Actions and State

First, we need to identify a control point within the model so Bonsai can take over its inputs and outputs. We do this by inserting a Bonsai block into the model, replacing the existing control block.

  • As discussed in the first post, Bonsai controls actions within the simulation model and receives state and reward in return. After running the model many times, the Bonsai BRAIN learns an optimal policy for the environment provided by the simulation; a toy sketch of this loop appears after the figure below.
  • In this example, the Bonsai block replaces the yaw controller of the turbine.
  • Controls may come in all sorts of shapes and structures. They can be inputs, knobs, switches, or any other control point within a simulation model that has inputs and outputs.
Bonsai control block inserted in Matlab/Simulink model
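To make the action/state/reward loop concrete, here is a toy stand-in for the yaw-control problem, written against the interface sketched above. The physics is deliberately simplified and the field names (`yaw`, `wind_dir`, `yaw_rate`) are hypothetical, not taken from the MathWorks model:

```python
import math
import random

class ToyTurbineSim:
    """Toy yaw-control environment: the action is a yaw-rate command,
    the state is the current yaw and wind direction, and the reward
    grows as the rotor faces the wind."""

    def reset(self, config=None):
        self.wind_dir = random.uniform(-math.pi, math.pi)
        self.yaw = 0.0
        return {"yaw": self.yaw, "wind_dir": self.wind_dir}

    def step(self, action):
        # Apply the commanded yaw rate (the "Bonsai block" output).
        self.yaw += action["yaw_rate"]
        # Wind direction drifts slowly between steps.
        self.wind_dir += random.gauss(0.0, 0.02)
        error = self.wind_dir - self.yaw
        # Captured power scales roughly with cos(error); use it as reward.
        reward = math.cos(error)
        state = {"yaw": self.yaw, "wind_dir": self.wind_dir}
        done = abs(error) > math.pi  # end the episode if badly misaligned
        return state, reward, done
```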

3: Connect Simulink model using the Bonsai Universal Coordinator
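The Bonsai Universal Coordinator is the bridge process that registers the Simulink model with the Bonsai Platform and shuttles actions and states between them during training. Conceptually, it runs a relay loop like the one below; this is a hypothetical sketch of what the coordinator does, not its actual code, and `brain.get_action` / `brain.record` are assumed names:

```python
def run_episodes(sim, brain, num_episodes=100):
    """Relay loop a coordinator conceptually runs: the BRAIN picks
    actions, the simulation executes them and reports state/reward."""
    for _ in range(num_episodes):
        state = sim.reset()
        done = False
        while not done:
            action = brain.get_action(state)           # ask the BRAIN
            state, reward, done = sim.step(action)     # advance the model
            brain.record(state, action, reward, done)  # feed the learner
```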

4: Inkling

  • Once the model is connected, users need to describe its state and actions using Bonsai’s special-purpose programming language, Inkling. This can be done in the Bonsai web interface or through the CLI; a conceptual analogue of these schemas follows the figure below.
Bonsai web interface showing Inkling code describing state, action, and curriculum
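Inkling’s state and action schemas declare the types and ranges of every field the BRAIN reads and writes. For the yaw problem they correspond, conceptually, to something like the following Python dataclasses; the field names are hypothetical, and actual Inkling syntax differs:

```python
from dataclasses import dataclass

@dataclass
class TurbineState:
    """What the BRAIN observes each step (an Inkling state schema, conceptually)."""
    yaw: float       # current nacelle yaw angle, radians
    wind_dir: float  # measured wind direction, radians
    power: float     # current power output, kW

@dataclass
class TurbineAction:
    """What the BRAIN controls (an Inkling action schema, conceptually)."""
    yaw_rate: float  # commanded yaw rate, radians per step
```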

5: Training

  • Now, you can start training the model and monitor the training graph.
  • During training, users may need to modify reward functions to optimize learning time and results; an illustrative reward function follows the figure below. A great resource on writing reward functions can be found here: Bonsai Training Video.
  • Learn more about the training graph.
Bonsai BRAIN details showing training graph and status
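Reward shaping is where most of the iteration happens. For yaw control, a natural starting point is to reward alignment with the wind while penalizing aggressive actuation. This is an illustrative function consistent with the toy simulation above, not the reward shipped with the MathWorks model:

```python
import math

def yaw_reward(state, action, rate_penalty=0.1):
    """Reward wind alignment, minus a small actuation cost.

    cos(error) is 1 when the rotor faces the wind and falls off as the
    yaw error grows; the penalty term discourages constant slewing,
    which would wear the yaw drive on a real turbine.
    """
    error = state["wind_dir"] - state["yaw"]
    return math.cos(error) - rate_penalty * abs(action["yaw_rate"])
```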

6: Predictions

Once training is complete, you can use the trained Bonsai BRAIN to get predictions.

  • Connect the Bonsai BRAIN to your simulation model and examine the quality of its predictions; a sketch of such a prediction loop follows the figure below.
Simulator view of Yaw Angle based on Bonsai BRAIN predictions
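In prediction mode the same relay loop runs, but the BRAIN’s policy is frozen and only queried. A hypothetical client loop, where `predict` stands in for whatever client wraps the trained BRAIN’s prediction endpoint (the endpoint shape is an assumption, not Bonsai’s documented API):

```python
def control_with_brain(sim, predict, steps=1000):
    """Drive the simulation with a trained, frozen policy.

    `predict` is any callable mapping a state dict to an action dict,
    e.g. a thin HTTP client for the trained BRAIN.
    """
    state = sim.reset()
    total_reward = 0.0
    for _ in range(steps):
        action = predict(state)  # query the trained BRAIN
        state, reward, done = sim.step(action)
        total_reward += reward
        if done:
            state = sim.reset()
    return total_reward
```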

Conclusion

Simulators are a crucial tool for reinforcement learning. Enterprises can use simulation models that reflect real-world business processes or physical systems and optimize them with Bonsai’s reinforcement learning technology. Typically, no changes to the simulation model are needed. If you missed our first post on how simulations can be used for training, you can find it on our blog.

Getting Started

Bonsai can help you apply deep reinforcement learning and build intelligent control into your own industrial systems, using Simulink as the training environment. If you are using Simulink and want to try out Bonsai AI, join our beta program and get started here.
