How do ANNs learn?
The learning ability of artificial neural networks ("ANNs") falls under the scientific field of machine learning. Machine learning is a generic term for the artificial generation of knowledge from experience. More specifically, an ANN learns from historical examples and, once the learning phase is over, can generalize from them because it has learned the patterns they contain. Machine learning has three learning paradigms: supervised learning, unsupervised learning, and reinforcement learning.¹
Supervised learning
In supervised learning, the ANN builds a target function that predicts target values as accurately as possible. It does so by assigning a presumed output value to each input value. It then compares its output values with the given target values and adjusts the target function until it reaches the desired accuracy. This method therefore relies on a scenario with a previously defined outcome.² Here is an example:
Suppose we play a video game in which we steer a ship in a harbor. The main objective of the game is to keep the ship from sinking, whether it stays in the harbor or goes on to conquer the succeeding levels (each level is more difficult than the last). The secondary objective is to collect as much gold as possible along the way. Gold always respawns, and the game ends when you finish the last level.⁴
In supervised learning, the AI tries to find a mathematical function that lets it finish all levels while collecting the most gold. It starts with a simple function that keeps failing and sinking the ship. Every time it fails, however, it checks the game’s optimal path and adjusts its function to get as close to the optimum as possible. Because the provided target values end with finishing the game with the maximum amount of gold, the AI will always try to finish the game as well.⁴
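The compare-and-adjust loop described above can be sketched in a few lines of Python. This is only a minimal illustration on a single linear unit, not a full ANN and not the game itself. The toy data, the function name `train`, the learning rate, and the number of epochs are all assumptions chosen for this example.

```python
# Hypothetical toy data: the known target values follow y = 2x + 1.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
targets = [1.0, 3.0, 5.0, 7.0, 9.0]


def train(inputs, targets, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # start with a simple function that predicts badly
    n = len(inputs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(inputs, targets):
            # Compare the model's output with the given target value.
            error = (w * x + b) - y
            grad_w += 2 * error * x / n
            grad_b += 2 * error / n
        # Adjust the function a little in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train(inputs, targets)
print(f"learned function: y = {w:.3f} * x + {b:.3f}")
```

The important point is that the targets are known in advance, so every adjustment is driven by the gap between the prediction and the given answer.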
Unsupervised learning
In unsupervised learning, the ANN identifies patterns in its input without access to previously known target values or rewards. The AI orients itself by the similarity of the input values and adapts its weights accordingly. The results of the learning process therefore cannot be compared with known results.² Let us look at how that might change our example:
Our AI is still playing the same game. The primary objective is to stay alive, and the secondary objective is to collect as much gold as possible. This time, however, the AI does not know the optimal path to beat the game. It has no target values to compare its approach against, and nothing tells it that it has to finish the game.⁴
In unsupervised learning, the AI is given one or more objectives but is not told how to achieve them. As a result, the AI creates a target function and tries to optimize it on its own to achieve those objectives. Consequently, the outcome of unsupervised learning can be unpredictable.⁴
In this case, one possible outcome is that the AI learns to steer the ship and collects gold in the harbor. It moves on to the next levels but decides that the risk-return ratio is not good enough. The AI therefore decides to stay in the harbor and cruise there forever, constantly collecting respawning gold. This way, it achieves its objectives optimally: the ship never sinks, and the AI collects an unlimited amount of gold.⁴
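Setting the game aside for a moment, the definition at the start of this section (weights adapted according to the similarity of the inputs, with no target values) can be sketched as simple competitive learning. This is a hedged illustration, not a description of any particular ANN: the sample points, the two starting weight vectors, and the function name `competitive_learning` are all assumptions made up for the example.

```python
def competitive_learning(samples, weights, learning_rate=0.2, epochs=50):
    """Move the most similar weight vector a little toward each input."""
    weights = [list(w) for w in weights]  # work on a copy
    for _ in range(epochs):
        for x in samples:
            # The "winner" is the unit whose weights are closest to the input.
            winner = min(
                range(len(weights)),
                key=lambda i: sum((wi - xi) ** 2 for wi, xi in zip(weights[i], x)),
            )
            # Only the winner adapts; no target value is involved anywhere.
            weights[winner] = [
                wi + learning_rate * (xi - wi) for wi, xi in zip(weights[winner], x)
            ]
    return weights


# Hypothetical inputs forming two loose groups, near (0.1, 0.1) and (0.9, 0.9).
samples = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
initial_weights = [(0.4, 0.4), (0.6, 0.6)]

weights = competitive_learning(samples, initial_weights)
print(weights)
```

Each weight vector drifts toward one group of similar inputs. Nothing in the loop says which grouping is "correct", which is why the result cannot be checked against known answers.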
Reinforcement learning
In reinforcement learning, the AI independently learns a strategy that maximizes the rewards it receives. The AI is not shown which action is best in which situation. Instead, it receives a reward at certain times, and that reward can also be negative. Based on this reward, the AI estimates a utility function that determines the value of a particular action.² Applied to our example, this looks as follows:
The AI plays the same game with the same rules. This time, however, it is not given any objectives. It doesn’t know what it is supposed to do, but it receives two kinds of feedback from the game: either it dies and loses everything (negative reward), or it stays alive and accumulates gold (positive reward).⁴
As with unsupervised learning, the outcome of reinforcement learning can be unpredictable because it is difficult to tell how the AI will react to the rewards it is given. In our game scenario, the outcome could be the same as for unsupervised learning:⁴
The AI starts cruising in the harbor, sinks the ship, and starts collecting gold. It will try to move on to the next levels but will find that the negative rewards increase while the positive rewards stay the same. As a result, it will try to minimize the negative rewards by staying in the lowest level forever.⁴
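The behavior described above can be imitated with a deliberately stripped-down sketch: a two-action bandit that estimates the value of each action from rewards alone. The reward numbers, the 30% chance of sinking, and the names `play` and `learn_action_values` are hypothetical and chosen only for illustration. A real game would also have states and levels, which this sketch leaves out.

```python
import random


def play(action, rng):
    """Hypothetical reward model of the harbor game."""
    if action == "stay":
        return 1.0  # gold collected safely in the harbor
    # Advancing yields the same gold but carries an assumed 30% risk of sinking.
    return -10.0 if rng.random() < 0.3 else 1.0


def learn_action_values(episodes=5000, epsilon=0.1, step_size=0.05, seed=0):
    """Estimate the value of each action with epsilon-greedy exploration."""
    rng = random.Random(seed)
    values = {"stay": 0.0, "advance": 0.0}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(list(values))  # explore: try a random action
        else:
            action = max(values, key=values.get)  # exploit the best estimate
        reward = play(action, rng)
        # Nudge the estimate for the chosen action toward the observed reward.
        values[action] += step_size * (reward - values[action])
    return values


values = learn_action_values()
best_action = max(values, key=values.get)
print(values, best_action)
```

The agent is never told to stay in the harbor. It ends up preferring that action only because the rewards it observed made advancing look worse.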
Conclusion
Supervised learning is one of the most commonly used learning approaches for ANNs.³ In supervised learning, an AI learns by forming a target function that tries to predict values as close as possible to the target values it is given. If you want to know more about how supervised learning differs from unsupervised learning, you can continue reading here.
Even though the outcome in our examples might be the same for unsupervised learning and reinforcement learning, the two approaches are fundamentally different. While unsupervised learning reaches a conclusion based on objectives, reinforcement learning focuses on maximizing positive rewards and minimizing negative ones. If you are interested in learning more about reinforcement learning, I recommend reading this article.
Sources
- Reitmaier. 2015. Aktives Lernen für Klassifikationsprobleme unter der Nutzung von Strukturinformationen, 18–167. Kassel: Universität Kassel.
- Mohri, Rostamizadeh and Talwalkar. 2012. Foundations of Machine Learning, 21–112. Cambridge: MIT Press.
- Lago, de Ridder and de Schutter. 2018. Forecasting spot electricity prices: Deep learning approaches and empirical comparison of traditional algorithms, 403. Amsterdam: Elsevier.
- Fridman, Lex. "MIT 6.S094: Introduction to Deep Learning and Self-Driving Cars." January 16, 2017. Educational video, 38:45. https://youtu.be/2UElC_YZ0Eo.