Approximating dynamic models of industrial robots with neural networks

Published in Towards Data Science, Jul 20, 2017 (7 min read)

Motion controllers for industrial robots normally generate two cyclic set values for each joint axis: a position reference value and a torque feed-forward value. The axis set positions are calculated by the kinematic model, while the motor torques are calculated by the dynamic model.

However, while kinematic models are usually simple to implement and are based on well-known parameters, dynamic models are more complex and rely on parameters that are often unknown and must be identified (e.g. inertia, friction). Dynamic models might also fail to describe all the physical properties of the mechanical structure accurately, either because some effects are not considered or because they are too difficult to model. For example, the elastic behavior of the joints is normally not taken into account.

An alternative to designing a fixed model that tries to predict reality is to let the robot learn the characteristics of its own dynamics. A simple supervised learning algorithm can learn the behavior of different mechanical structures and replace the standard implementation of the dynamic model.

This technique could be applied to the following cases:

• A dynamic model for a particular kinematic structure is not known or is too complicated to describe analytically.

• The available dynamic model is too simplistic and fails to predict the real behavior with high accuracy.

Also, since learning algorithms can be easily trained online (i.e. while the robot is in motion), the robot could adapt to long-term unpredictable changes in the dynamics (e.g. increased friction due to mechanical wear, or modified inertia due to load shift).

Control structure

The control structure for each individual robotic axis consists of a closed-loop PID paired with an open-loop torque feed-forward.
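As a rough sketch of how the two paths combine for one axis, the feedback output and the feed-forward value are simply summed. The class name, the gains and the cycle time below are illustrative assumptions, not values from the controller used in the tests:

```python
from dataclasses import dataclass, field


@dataclass
class AxisController:
    """Closed-loop PID plus open-loop torque feed-forward for one axis.

    Gains and cycle time are placeholder values chosen for illustration.
    """
    kp: float = 50.0
    ki: float = 5.0
    kd: float = 0.5
    dt: float = 0.001  # control cycle in seconds (assumed)
    _integral: float = field(default=0.0, init=False)
    _prev_error: float = field(default=0.0, init=False)

    def update(self, set_position: float, actual_position: float,
               torque_ff: float) -> float:
        """Return the torque command for the current cycle."""
        error = set_position - actual_position
        self._integral += error * self.dt
        derivative = (error - self._prev_error) / self.dt
        self._prev_error = error
        pid_torque = (self.kp * error + self.ki * self._integral
                      + self.kd * derivative)
        # The feed-forward term is added to the feedback output.
        return pid_torque + torque_ff
```

With an accurate feed-forward value the PID only has to correct the residual error, which is why a better torque prediction shows up as a smaller lag error.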

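The standard approach to calculating the feed-forward value is to solve an inverse dynamics function $D^{-1}$ that yields the motor torques ($\tau$) from the current set joint positions ($\alpha$), velocities ($\dot{\alpha}$) and accelerations ($\ddot{\alpha}$):

$$\tau = D^{-1}(\alpha, \dot{\alpha}, \ddot{\alpha})$$

Inverse dynamics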

This equation is normally solved via Lagrange or Newton-Euler methods, usually assuming a rigid body structure. The model’s parameters are identified offline and never changed over time.
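For a single rigid link, such a model might look like the sketch below. The pendulum-style structure and every parameter value (inertia, friction coefficients, mass, lever arm) are assumptions for illustration; in practice they would come from the offline identification step:

```python
import math


def inverse_dynamics_single_axis(pos, vel, acc,
                                 inertia=0.05,   # kg*m^2 (assumed)
                                 viscous=0.01,   # N*m*s/rad (assumed)
                                 coulomb=0.2,    # N*m (assumed)
                                 mass=2.0,       # kg (assumed)
                                 lever=0.3,      # m (assumed)
                                 g=9.81):
    """Torque needed to follow (pos, vel, acc) with one rigid link.

    pos is the joint angle in rad, measured from the horizontal.
    """
    inertial = inertia * acc
    if vel == 0.0:
        friction = 0.0
    else:
        # Viscous friction plus a constant Coulomb term opposing the motion.
        friction = viscous * vel + coulomb * math.copysign(1.0, vel)
    gravity = mass * g * lever * math.cos(pos)
    return inertial + friction + gravity
```

Effects such as joint elasticity have no term in a model like this, which is the kind of gap a learned approximator is meant to fill.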

We now try to replace the dynamic model with a function approximator in the form of a sequential neural network.

The network learns the dynamics of the mechanical structure from a training set made up of three features per axis (position, velocity and acceleration) and one target label per axis (torque). Both offline and online training are possible.

Once the network has been trained, it can be used in real time to predict the motor torque values, which are sent to the servo drive’s current loop as feed-forward control signals.

Controller structure including an online learning function approximator

The neural network used is very shallow and consists of:

• One input layer for the set position, velocity and acceleration of each axis.

• Two fully-connected hidden layers with 100 units each, ReLU activation functions and randomly initialized weights.

• One output layer with one single linear unit per axis, representing the predicted motor torque.

Neural network architecture for a 6-axis robot

This specific network configuration requires a total of 12606 parameters for a 6-axis robot.
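The parameter count can be checked with a few lines of Python, assuming that every fully-connected layer has one bias per unit (the helper name is hypothetical):

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully-connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


axes = 6
# 3 features per axis, two hidden layers of 100 units, one output per axis.
layer_sizes = [3 * axes, 100, 100, axes]
print(count_parameters(layer_sizes))  # 12606
```

The three layers contribute 1900, 10100 and 606 parameters respectively.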

Training process

Training the network is currently done in a fully supervised way. Reinforcement learning techniques (e.g. policy gradient) could be applied to further optimize the quality of the outcome and will be investigated in future work.

The first learning step consists of offline training on a large amount of labeled data captured while the robot is moving. A single example contains three features (axis position, speed and acceleration) and one target label (motor torque) per axis.

In particular, the encoder positions and the motor torques are recorded simultaneously during the movement. The axis speeds and accelerations are then calculated from the positions by differentiation.
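A minimal sketch of this step, assuming a fixed sampling period and plain finite differences (the article does not say which differentiation scheme was used):

```python
def differentiate(samples, dt):
    """Central differences inside the trace, one-sided at both ends."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    out = [0.0] * n
    out[0] = (samples[1] - samples[0]) / dt
    out[-1] = (samples[-1] - samples[-2]) / dt
    for i in range(1, n - 1):
        out[i] = (samples[i + 1] - samples[i - 1]) / (2.0 * dt)
    return out


dt = 0.001  # sampling period in seconds (assumed)
# Synthetic trace with a constant acceleration of 2 rad/s^2.
position = [0.5 * 2.0 * (i * dt) ** 2 for i in range(6)]
velocity = differentiate(position, dt)
acceleration = differentiate(velocity, dt)
```

Differentiating twice amplifies encoder noise, so real recordings may need low-pass filtering before being used as features.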

To make the learning process faster and uniform across the input data, all features are pre-processed with standard scaling, i.e. shifted to zero mean and scaled to unit variance. The example below shows data recorded for a single axis.

Recorded features of a training set for a single axis after passing through a standard scaler
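Standard scaling itself is a small computation. The class below is a hand-written stand-in for a library scaler, shown only to make the operation explicit; its name and interface are assumptions:

```python
import statistics


class StandardScaler:
    """Per-feature scaling to zero mean and unit variance."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.mean = [statistics.fmean(c) for c in columns]
        # Population standard deviation; use 1.0 for constant features
        # to avoid dividing by zero.
        self.std = [statistics.pstdev(c) or 1.0 for c in columns]
        return self

    def transform(self, rows):
        return [[(x - m) / s for x, m, s in zip(row, self.mean, self.std)]
                for row in rows]
```

The fitted mean and standard deviation have to be stored with the trained weights, because the same values are needed again at prediction time.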

The training process is implemented using an RMSprop optimizer with a starting learning rate of 1e-4. The loss function is a standard mean squared error.
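For reference, one RMSprop update on a mean squared error loss can be sketched as follows. The decay factor `rho` and the constant `eps` are commonly used defaults and are assumptions here, as is the toy one-weight problem:

```python
import math


def mse(predicted, target):
    """Mean squared error over a batch of scalar predictions."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def rmsprop_step(weights, grads, cache, lr=1e-4, rho=0.9, eps=1e-7):
    """One in-place RMSprop update; cache holds the running mean of grad^2."""
    for i, g in enumerate(grads):
        cache[i] = rho * cache[i] + (1.0 - rho) * g * g
        weights[i] -= lr * g / (math.sqrt(cache[i]) + eps)


# Toy problem: fit torque = w * acceleration, with a true w of 0.5.
acc = [-1.0, 0.0, 1.0, 2.0]
torque = [0.5 * a for a in acc]
w, cache = [0.0], [0.0]
loss_before = mse([w[0] * a for a in acc], torque)
for _ in range(100):
    # Gradient of the MSE with respect to w.
    grad = sum(2.0 * (w[0] * a - t) * a for a, t in zip(acc, torque)) / len(acc)
    rmsprop_step(w, [grad], cache)
loss_after = mse([w[0] * a for a in acc], torque)
```

Because the gradient is divided by its running magnitude, each step has a size close to the learning rate regardless of how the features are scaled.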

A random 10% subset of the training data is set aside for validation. Common tuning and regularization techniques are then applied to decrease the validation loss:

- Learning rate decay: prevents oscillations around the target optimum and increases accuracy towards the end of the learning phase.

- Network architecture: 100 units per hidden layer, as a compromise between network capacity and training time.

- Batch size: mini-batches of 200 examples for stochastic gradient descent.

- Early stopping: the training set is looped over for 120 epochs, at which point the validation loss starts to plateau and training is stopped to avoid overfitting the network.
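The bookkeeping around these choices can be sketched with a few helpers. The function names, the inverse-time decay schedule, and the `patience` and `min_delta` values are assumptions; only the 10% split and the batch size of 200 come from the setup described above:

```python
import random


def split_holdout(examples, fraction=0.1, seed=0):
    """Shuffle and set aside `fraction` of the examples for validation."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * fraction))
    return shuffled[n_val:], shuffled[:n_val]


def minibatches(examples, batch_size=200):
    """Yield consecutive slices of at most `batch_size` examples."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


def decayed_lr(initial_lr, epoch, decay=0.01):
    """Inverse-time decay; the schedule and rate are assumptions."""
    return initial_lr / (1.0 + decay * epoch)


def should_stop(val_losses, patience=5, min_delta=1e-5):
    """True once the validation loss has not improved for `patience` epochs."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    return min(val_losses[-patience:]) > best_before - min_delta
```

A training loop would call `should_stop` once per epoch with the history of validation losses and keep the weights from the best epoch.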

Once the optimal weights are learned offline, the network can be used for real-time prediction. Training can also continue online, so that the network adapts to changes in the dynamics.

Torque prediction

The trained weights are used to cyclically predict the motor torques from the current set state of each axis.

Given the shallow architecture and small layer size of the network, a naïve implementation of matrix multiplication is enough to run in real time, taking only a few µs of computing time.

The ReLU non-linearities are also cheaper to compute in real time than the sigmoid and tanh alternatives, which require evaluating exponentials.
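A naïve forward pass of this kind needs nothing more than nested loops. The sketch below is written in Python for readability, whereas a real-time controller would more likely use a compiled language; the function names and the layer representation are assumptions:

```python
def dense(inputs, weights, biases):
    """y[j] = sum_i inputs[i] * weights[i][j] + biases[j], as plain loops."""
    outputs = list(biases)
    for i, x in enumerate(inputs):
        row = weights[i]
        for j in range(len(outputs)):
            outputs[j] += x * row[j]
    return outputs


def relu(values):
    """ReLU is a single comparison per unit."""
    return [v if v > 0.0 else 0.0 for v in values]


def predict_torques(features, layers):
    """layers is a list of (weights, biases); the last layer stays linear."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return dense(activations, weights, biases)
```

For the 6-axis network this amounts to 18·100 + 100·100 + 100·6 = 12400 multiply-accumulate operations per cycle.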

The input data (axis position, speed and acceleration) are scaled with the same factors that were used during training and then fed to the network. The predicted torque values are cyclically sent to the servo drives as feed-forward values for the current loop.

A torque limiter is active for safety reasons in case predicted values are too high.
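Such a limiter can be as simple as a symmetric clamp per axis. The function name and the use of one limit per axis are assumptions:

```python
def limit_torque(predicted, max_torque):
    """Clamp each predicted torque to [-max_torque[i], +max_torque[i]]."""
    return [max(-limit, min(limit, value))
            for value, limit in zip(predicted, max_torque)]


print(limit_torque([5.0, -12.0, 0.3], [4.0, 10.0, 1.0]))  # [4.0, -10.0, 0.3]
```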

Results

The neural network was tested with different dynamic structures and its torque prediction was compared to the actual motor value.

Single axis with constant inertia

The first test was performed on a single-axis system with a constant inertia load, and a standard feed-forward (FF) model was taken as the comparison benchmark.

The network was trained on a small set of 2000 examples. The following figure compares the actual motor torque with the predicted values from a standard FF model and from the neural network. No feed-forward was sent to the drives in this case.

Comparison between single axis motor torque values: actual (green), standard FF (purple), NN predicted (red)

The predicted torque values are similar, although the standard FF model provides a more detailed representation of the real dynamics.

Single axis with variable inertia

The second test was also performed on a single axis system, but this time with an asymmetric vertical load, which causes the inertia to vary according to the axis position. The quality of the results is checked by observing the effects on the axis positioning error.

The following figure compares the actual motor torque with the predicted value from the neural network. No feed-forward was sent to the drives in this case.

Comparison between single axis motor torque values with variable inertia load: actual (green) and NN predicted (red)

The next step is to send the predicted torque to the drive as a feed-forward value: a reduction in the lag error is clearly visible while the axis is moving, particularly during the acceleration phase.

Comparison between axis positioning error values: without torque feed-forward (green) and after applying the NN feed-forward (red)

6-axis robot

The last test was performed on a 6-axis articulated robot with an 8 kg payload.

The neural network in this case has 18 input units and 6 output values. The system is highly non-linear as a result of the kinematic chain, and the torque of each axis depends on the dynamic state of all the other axes.

After a short offline training with a set of about 10000 examples, the network was already able to predict with fairly good approximation the torque values for each motor, as shown in the following figure recorded during a random movement.

Comparison between torque values of the six motors of an articulated robotic arm: actual values (green) and predicted by the NN (red)

Better prediction quality could likely be achieved with deeper networks and larger training sets.

Conclusion

A simple feed-forward neural network with two hidden layers was tested as a function approximator to learn the dynamic models of different kinematic structures.

Practical results show that even with such a shallow architecture and small training sets, the network can provide predictions that are good enough to reduce the positioning error of each axis.

The technique is clearly not meant to replace dynamic models for well-known structures, but it has a very interesting potential in several applications where models are either not available or not accurate enough.

A few advantages are:

- It can be applied to any user-defined kinematics, even highly complex ones with several joints (the capacity of the neural network might have to be increased accordingly)

- It can predict effects that are difficult to model analytically

- It can adapt to time-dependent changes in the mechanics

On the other hand, the most obvious disadvantage is that the internal structure of the network is not human-readable: any learned information about mechanical dynamics cannot be extracted.