Analyzing Chaos Theory, Machine Learning and predictability, inspired by Jurassic Park

Inspiration from Chaos in Jurassic Park:
I was reading the 1990 Jurassic Park novel for the first time and came across an interesting passage where Dr. Ian Malcolm explains Chaos Theory and claims that prediction of chaotic systems by computers is impossible, while recounting how the desire to predict weather, a chaotic system, led Von Neumann to create the modern (stored-program) computer. Chaos is a central theme of the Jurassic Park series and its explanation of how things go wrong: Dr. Malcolm keeps warning John Hammond, the creator of Jurassic Park, that such systems can’t be contained and managed because of Chaos. By coincidence, I had bought a book on Chaos Theory shortly before the novel. I had no idea that it was a book from 1987 and that the character of Dr. Malcolm was actually inspired by James Gleick, the writer of Chaos. I read that book with much more interest after realizing that.
As a Data Scientist and JP fan, I decided to do more research and analysis, evaluate how the claim holds up after recent advancements in Machine Learning and computation, and also look at how Chaos Theory has been applied to and influenced science over the past few decades, especially computer science and Data Science. Chaos Theory is considered among the most important concepts discovered in science during the 20th century. It is an interesting and useful concept, yet it is seldom discussed among data science and AI people, so I’m writing this article on it.
What’s Chaos Theory:
Chaos Theory is the study of complex, usually non-linear systems that are highly sensitive to initial conditions and alterations in state. Their complex interactions, mixing, non-periodicity and feedback loops produce rapidly evolving, irregular and often unexpected behavior that is effectively unpredictable, even though there are underlying order and patterns beneath the disorder. Prediction rapidly becomes harder the farther into the future you try to predict. Sometimes apparently simple systems exhibit chaos too.
Edward Lorenz, the founder of Chaos Theory, summarized it in an interesting way: "When the present determines the future, but the approximate present does not approximately determine the future." That is, the systems are deterministic, but a small change in initial conditions results in the evolution of very different, unpredictable behavior; the deterministic behavior appears to be random.
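Lorenz's point can be seen in a few lines of code. The sketch below (my own toy illustration, not from the novel or any reference) runs the logistic map, a textbook chaotic system, from two initial conditions differing only in the ninth decimal place:

```python
# Sensitive dependence on initial conditions, shown with the logistic map
# x_{n+1} = r * x_n * (1 - x_n) in its chaotic regime (r = 4).
def logistic_trajectory(x0, r=4.0, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-9)  # perturb only the 9th decimal place

# After a few steps the trajectories are still nearly identical...
gap_early = abs(a[3] - b[3])
# ...but after ~30 steps the tiny perturbation has been amplified so much
# that the two runs are completely decorrelated.
gap_late = max(abs(x - y) for x, y in zip(a[30:], b[30:]))
```

The "approximate present" (a nine-decimal-place measurement) fails to even approximately determine the future after a few dozen iterations.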
Chaos Theory is the basis of the famous Butterfly Effect, which is often mentioned in pop culture and movies: the notion, as Gleick summarizes Lorenz’s idea, that "a butterfly stirring the air today in Peking can transform storm systems next month in New York."
Uses of Chaos Theory:
Chaos is common in the world, and concepts from Chaos Theory have a wide range of applications. They are used not only to facilitate analysis of chaotic systems, establish considerations and limitations, and detect chaos, but also to design, produce and improve many things, sometimes by deliberately creating chaos while exploiting initial conditions and chaotic maps. Apart from core areas of natural sciences like Physics, Chemistry and Biology, some interesting areas of application are Cryptography, Economics, Ecology, Agriculture, Astronomy, Politics, Anthropology, Physiology, Psychology, Meteorology, Geology, Optimization, Robotics & Control, Computer Science, Electrical & Telecom Engineering, Generative Art and Textile Design.
Pseudo-random number generation underlies anything programmed to behave randomly, and it’s very common in AI and ML. Chaotic processes and their sensitivity to initial conditions are used to generate secure pseudo-random numbers. They are also used for image encryption, making ciphers and creating watermarks.
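As a toy illustration of the idea (and only that; real chaos-based generators and ciphers are far more elaborate, and this sketch is not cryptographically secure), one can iterate a chaotic map and threshold its state to emit pseudo-random bits, with the seed playing the role of the initial condition:

```python
# Toy chaos-based pseudo-random bit generator (illustrative only, NOT
# secure): iterate the logistic map in a chaotic regime and emit one
# bit per iteration depending on which half of [0, 1] the state is in.
def chaotic_bits(seed, n_bits, r=3.99):
    x = seed  # the seed is the initial condition; nearby seeds
              # quickly produce unrelated bit streams
    bits = []
    for _ in range(n_bits):
        x = r * x * (1.0 - x)
        bits.append(1 if x > 0.5 else 0)
    return bits

stream = chaotic_bits(seed=0.123456789, n_bits=1000)
```

Sensitivity to initial conditions is exactly what makes the stream hard to reproduce without knowing the seed precisely.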
Some communication systems and circuits use chaotic signals to achieve secure communication. Chaotic processes are also used in home appliances and consumer electronics.
In Robotics, chaos dynamics are used in the form of Chaos Analysis or Chaos Synthesis. Chaos Synthesis is about generating chaos to accomplish tasks, while Chaos Analysis is the observation and analysis of chaotic behavior. Chaos Theory is used in behavior analysis of Swarm Intelligence and multi-agent systems, bipedal robot locomotion, and mobile robots interacting with their environment. It is also used in some motion planning algorithms, and fractals are used in modular robotics.
Fractals are complex and often beautiful visual patterns that result from chaos. They are common throughout the universe, from cell membranes to solar systems, making things beautiful and unique, and they are also used in generating digital art and designing textile patterns.
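The best-known fractal, the Mandelbrot set, takes only a few lines to compute. This sketch (a standard escape-time rendering, included here just to make the idea concrete) classifies each point c by whether the iteration z → z² + c stays bounded:

```python
# Escape-time test for the Mandelbrot set: c belongs to the set if the
# orbit of z -> z^2 + c starting from z = 0 never escapes to infinity.
def escape_time(c, max_iter=50):
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:  # once |z| > 2 the orbit provably escapes
            return n
    return max_iter  # treated as "inside" at this resolution

# Coarse ASCII rendering of a region of the complex plane
rows = []
for im in [i * 0.25 for i in range(-4, 5)]:
    rows.append("".join(
        "#" if escape_time(complex(re * 0.25, im)) == 50 else "."
        for re in range(-8, 3)))
```

Zooming into the boundary reveals infinitely detailed self-similar structure, which is what makes fractals attractive for generative art.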
Chaos Theory is also used in meta-heuristic optimization and Evolutionary Computing, which are applied in Artificial Intelligence and Robotics as well as many engineering and industrial problems like polymer manufacturing and telecommunications. Chaos is injected to create chaotic versions of Particle Swarm Optimization and Genetic Algorithms that are better at avoiding local optima. Agent-based modeling and simulation also utilize Chaos Theory in some cases.
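The chaos-injection idea can be sketched concretely. Below is a minimal toy version of "chaotic PSO" (my own illustrative sketch, not any specific published variant): a standard Particle Swarm Optimization loop in which the uniform random draws of the velocity update are replaced by values from a chaotic logistic map:

```python
# Chaotic logistic-map sequence used in place of uniform random numbers,
# the basic trick behind many chaotic PSO variants.
def chaos_stream(x=0.7, r=3.99):
    while True:
        x = r * x * (1.0 - x)
        yield x

def chaotic_pso(f, dim=2, n_particles=20, iters=200,
                w=0.7, c1=1.5, c2=1.5, bound=5.0):
    rnd = chaos_stream()
    nxt = lambda: next(rnd)
    # Initialize particle positions and velocities from the chaotic stream
    pos = [[(nxt() * 2 - 1) * bound for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # per-particle best positions
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # swarm best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * nxt() * (pbest[i][d] - pos[i][d])
                             + c2 * nxt() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Minimize the sphere function as a toy benchmark
best, best_val = chaotic_pso(lambda x: sum(v * v for v in x))
```

The intuition is that a chaotic sequence covers the search space less repetitively than some pseudo-random generators, helping the swarm escape local optima.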
There are chaos-inspired model training and optimization methods in Machine Learning, as well as Neural Network architectures inspired by chaos, and experiments have found that chaotic Neural Networks process information quickly and efficiently. Stochastic Gradient Descent (SGD) and the convergence of Neural Network training exhibit chaotic behavior, and recently Chaos Theory was used to improve the current understanding of Neural Network optimization and explain why SGD works well.
Chaotic Systems, Predictive Analytics and Machine Learning:
There are many examples of chaotic systems, including pandemics, road traffic, crypto and stock markets, the economy, sports, crowds, multi-agent systems, autonomous driving, consumer behavior, sales and societies, many of which I’ve also worked on for predictive analytics.
I believe that the effect of chaos, and the limitations and challenges it imposes on prediction, should be considered and acknowledged for such systems, to avoid unrealistic expectations, especially if you are not carefully using domain knowledge and specialized methods.
Modern computation power, availability of data and advancements in Machine Learning do give us hope of predicting the unpredictable to some extent in chaotic systems, if models can somehow learn the underlying order and patterns within the disorder. There are examples where predictive analytics does a good job; it depends on the data and methodology as well as the complexity and nature of the problem.
Chaotic systems are sensitive to initial conditions and to the precision of measurements, so prediction requires detailed information about the complete system. Often we don’t have the required or enough data, and the feature values are noisy and inaccurate. We rarely have all the features and state representation that create the complex interactions and changes in the chaotic system. Another challenge is that having more features exponentially increases the data and computation required, due to "the curse of dimensionality". Accounting for feedback loops and emergent behavior makes the problem harder still.
Following are some methods and ideas that are proving effective at mitigating the effect of chaos and making decent predictions in such chaotic systems:
A simple idea, important in Machine Learning in general too, is continuous training, evaluation and monitoring of the model and system. Predictive analytics of complex evolving systems should be an iterative process. Chaos Theory particularly emphasizes the unpredictability of chaotic systems over long horizons, i.e. far ahead in the future, so periodically retraining the system on fresh data helps.
The use of ensemble methods has been emphasized in some successful attempts at long-horizon prediction in systems with chaotic dynamics. In ensemble methods, predictions from multiple models are combined to get a better prediction based on the "wisdom of the crowd". Ensembles are a useful and common way to improve predictions in Machine Learning in general too. In systems with chaotic dynamics, they are especially helpful because multiple models jointly tackle the system’s sensitivity to initial conditions, state changes and noisy measurements.
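A tiny synthetic demo makes the "wisdom of the crowd" effect concrete (my own illustration, with made-up data): five forecasters see the same signal corrupted by independent noise, and averaging their predictions cancels part of that noise:

```python
import math, random

# Five noisy "forecasters" of a common signal. Averaging their
# predictions reduces squared error: the independent noise partially
# cancels, and by convexity the ensemble MSE never exceeds the
# members' average MSE.
random.seed(42)
truth = [math.sin(0.1 * t) for t in range(200)]
members = [[y + random.gauss(0.0, 0.3) for y in truth] for _ in range(5)]

ensemble = [sum(col) / len(col) for col in zip(*members)]

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

member_mse = sum(mse(m, truth) for m in members) / len(members)
ensemble_mse = mse(ensemble, truth)
```

With M independent, equally noisy members, the noise variance of the average drops by roughly a factor of M, which is why weather centers run ensembles of perturbed forecasts.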
Data assimilation methods such as ensemble Kalman filters combine new observations with model forecasts to improve them. They are used for improving predictions in numerical simulations of chaotic systems, and their outputs are sometimes used as inputs for ML models.
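The core idea fits in a few lines. Here is a minimal one-dimensional Kalman filter (a deliberately simplified sketch of the building block behind data assimilation; the parameter values are arbitrary choices for the demo):

```python
import random

# Minimal 1-D Kalman filter step: blend a model forecast with a new
# observation, weighted by their respective uncertainties.
def kalman_step(x_est, p_est, z, q=0.001, r_obs=0.25):
    # Forecast: propagate the state (identity model here) and
    # grow the uncertainty by the process-noise variance q.
    x_pred, p_pred = x_est, p_est + q
    # Analysis: the Kalman gain k weights the observation z against
    # the forecast according to their variances.
    k = p_pred / (p_pred + r_obs)
    return x_pred + k * (z - x_pred), (1.0 - k) * p_pred

# Track a constant true value of 1.0 from noisy observations.
random.seed(0)
x, p = 0.0, 1.0   # initial guess and its variance
for _ in range(200):
    z = 1.0 + random.gauss(0.0, 0.5)   # noisy measurement
    x, p = kalman_step(x, p, z)
```

Ensemble Kalman filters extend this scalar update to high-dimensional chaotic models by representing the forecast uncertainty with an ensemble of model runs instead of a single variance.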
Feedback loops, sequential dynamics and keeping track of state are an inherent part of chaotic dynamics, so Recurrent Neural Networks (RNNs) are a natural choice for dealing with them. Long Short-Term Memory networks (LSTMs) are a type of RNN that outperform plain RNNs by dealing with issues like the vanishing gradient. RNNs like LSTMs have been used effectively in several studies and experiments on using ML for prediction in chaos [1]. They can forecast a time series continuously by utilizing memory and the sequence of past states. Some solutions use simple deep neural networks, sometimes combining them with auto-regressive models.
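Whatever the network, the data preparation is the same: the series is cut into sliding windows of past states, each paired with the next state as the prediction target. A minimal sketch (using the logistic map as a stand-in for any chaotic signal; the helper name is my own):

```python
# Turn a time series into (window of past states, next state) pairs,
# the standard supervised setup for one-step-ahead RNN/LSTM forecasting.
def make_windows(series, window):
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])   # the last `window` states
        y.append(series[i + window])     # the state to predict
    return X, y

# A chaotic series from the logistic map as example data
series = [0.3]
for _ in range(99):
    series.append(4.0 * series[-1] * (1.0 - series[-1]))

X, y = make_windows(series, window=8)
```

An LSTM trained on such pairs can then be rolled forward, feeding its own predictions back in, to forecast several steps ahead until chaos-driven error growth takes over.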
Phase space is the state of a system represented in an n-dimensional space of all possible states. The system follows a trajectory in that space, and that trajectory is the solution to the differential equations describing the system. Chaotic systems cannot generally be solved analytically, but the qualitative behavior and flavor of their motion in phase space can be estimated. There is a region of phase space, called an attractor, which all trajectories outline and seem attracted toward. With a strange attractor, we cannot know where on the attractor the system will be, due to the lack of perfect information about initial conditions. The Lorenz attractor is a strange attractor, discovered by Lorenz during experimentation on convection. It looks like a continuously spiraling butterfly shape with no intersections in its trajectories, as no state is ever repeated, showing non-periodicity and chaos. It revolutionized Chaos Theory and its plot became the field’s symbol.
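The Lorenz attractor is easy to trace numerically. The sketch below integrates the Lorenz equations with simple Euler steps (a higher-order scheme such as RK4 would be more accurate, but this suffices to outline the butterfly):

```python
# Lorenz system with the classic parameters sigma=10, rho=28, beta=8/3,
# integrated with a simple Euler step of size dt.
def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    dx = sigma * (y - x)          # dx/dt
    dy = x * (rho - z) - y        # dy/dt
    dz = x * y - beta * z         # dz/dt
    return (x + dx * dt, y + dy * dt, z + dz * dt)

# Trace a trajectory in phase space; plotting (x, z) shows the
# butterfly-shaped strange attractor.
state = (1.0, 1.0, 1.0)
trajectory = [state]
for _ in range(5000):
    state = lorenz_step(state)
    trajectory.append(state)
```

The trajectory stays bounded on the attractor while hopping unpredictably between the two "wings", which is exactly the mix of underlying order and unpredictability described above.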
Autoencoders learn a compressed representation of data in an unsupervised way. They are also utilized for state representation in chaotic systems, sometimes combined with RNNs, and they can learn to efficiently represent the attractors of chaotic systems. In one experiment [2], an LSTM autoencoder with false-nearest-neighbor regularization was used to encode and estimate the Lorenz attractor through unsupervised learning and reconstruct the state space from that information. This was similar to the delay-coordinate embedding used in non-linear dynamics for state estimation, while circumventing its limitations.
Convolutional Neural Networks (CNNs) can also be used for sequential modeling in the form of Temporal CNNs, which use 1D convolutions to perform complex feature extraction from time series. They prove to be a good alternative to RNNs for some time series problems and are sometimes combined with them. Temporal CNNs have also been applied to prediction in chaotic systems. One study [3] combined an autoencoder with a CNN and used that AE-CNN for medium-to-long-term prediction of chaotic hourly meteorological data, improving it further by using transfer learning.
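The 1D convolution at the heart of a Temporal CNN is simple: a small kernel slides along the series and computes a weighted sum at each position, extracting local temporal features. A bare-bones sketch (in a real Temporal CNN the kernel weights are learned, and many such filters are stacked with non-linearities):

```python
# A single "valid" 1-D convolution: slide the kernel along the series
# and take the dot product at each position.
def conv1d(series, kernel):
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]

# A hand-crafted difference kernel highlights change points in a signal,
# the kind of local temporal feature a Temporal CNN learns automatically.
signal = [0, 0, 1, 1, 1, 0, 0]
edges = conv1d(signal, [1, -1])
```

Stacking dilated versions of this operation lets Temporal CNNs cover long histories with far fewer sequential steps than an RNN.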
There are also applications of Machine Learning, especially Reinforcement Learning, for controlling and reducing chaos in systems. Sometimes signal processing techniques like wavelets are used for that as well.
Some of the most impressive recent applications of Machine Learning for prediction in chaotic systems utilize Reservoir Computing (RC), a framework based on ideas behind RNNs that performs data-driven prediction of the evolution of a system’s state. The reservoir is a complex network of randomly connected non-linear units that store information in recurrent loops; it learns and represents the dynamics of the system through those recursive connections. Low-dimensional time series input is mapped to a high-dimensional, time-dependent vector in the reservoir by an input-to-reservoir (I/R) module, and the evolving state of the dynamical system is mapped back and predicted by a reservoir-to-output (R/O) module [4]. The RC system is trained by learning only a limited set of output parameters. In RC systems, Machine Learning helps fill the gaps of missing information about the state.
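A minimal echo state network, the most common RC flavor, shows how little is actually trained. In this sketch (my own toy illustration, with a sine wave standing in for a chaotic signal and arbitrary hyperparameters), the random reservoir and input weights stay fixed; only the linear readout is fit, here by least squares:

```python
import numpy as np

# Minimal echo state network: a fixed random recurrent reservoir expands
# the input into a high-dimensional state; only the linear readout is
# trained.
rng = np.random.default_rng(0)
n_res = 200
W_in = rng.uniform(-0.5, 0.5, size=n_res)        # input-to-reservoir (I/R)
W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))  # fixed recurrent weights
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1
                                                 # (echo state property)

def run_reservoir(u_seq):
    x = np.zeros(n_res)
    states = []
    for u in u_seq:
        x = np.tanh(W_in * u + W @ x)            # reservoir update
        states.append(x.copy())
    return np.array(states)

# One-step-ahead prediction of a sine wave as a toy stand-in for a
# chaotic signal.
u = np.sin(np.arange(600) * 0.1)
states = run_reservoir(u)
X, y = states[100:-1], u[101:]                   # drop the transient
W_out, *_ = np.linalg.lstsq(X, y, rcond=None)    # trained readout (R/O)
mse = float(np.mean((X @ W_out - y) ** 2))       # training fit error
```

Because only `W_out` is learned, training reduces to a single linear solve, which is what makes RC so cheap compared to backpropagating through an RNN.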
RC has been applied effectively to chaotic system prediction in several research works, with various modules and improvements added to better suit chaotic systems. One RC system [5,6] was used to make predictions for the complex and chaotic Kuramoto-Sivashinsky equation of flame propagation, producing highly accurate, very long-horizon predictions of the chaotic spatio-temporal system without prior knowledge of the system.
Model-based methods for system analysis and prediction are common in science and engineering. If a good model of the system is known, it can do better than ML in many cases. Domain knowledge is very helpful in data science, both used directly and for feature engineering in combination with ML, so it should not be ignored. Several methods for predictive analytics in chaotic systems are hybrids of Machine Learning and a model of the system, or combine domain knowledge with ML algorithms.
Modern weather forecasting still uses meteorological system models and equations, commonly with ensembles of predictions from different models to mitigate the effect of imperfect information and initial conditions. Long-term weather prediction is still not very good: studies have shown that beyond 8 days, forecasts are worse than using the historical average for that date [7]. However, promising work is being carried out on using ML for long-term weather prediction, and the experts working on it claim to be highly optimistic [8], though it will require time for improvement and large-scale deployment. It is interesting to note that a main objective behind building modern computers is still not fully achieved, despite the astonishing growth in computation power over the decades.
Coming back to Dr. Malcolm’s claim in Jurassic Park: it should be kept in mind that prediction in chaotic systems is challenging and unreliable. However, good progress is being made toward decent predictive analytics on chaotic systems using modern computation and advancements in Machine Learning and Deep Learning.
References:
[1] M. Modondo, T. Gibbons, "Learning and Modeling Chaos Using LSTM Recurrent Neural Networks". MICS, 2018.
[2] Sigrid Keydana, "Deep Attractors: Where deep learning meets chaos", RStudio AI Blog, 2020.
[3] Baogui Xin, "Prediction for Chaotic Time Series-Based AE-CNN and Transfer Learning", Hindawi-Complexity, 2020.
[4] H. Fan, J. Jiang, et al, "Long-term prediction of chaotic systems with Machine Learning", APS Physical Review Research, 2020.
[5] Natalie Wolchover, "Machine Learning’s ‘Amazing’ Ability to Predict Chaos", Quanta Magazine, 2018.
[6] J. Pathak, B. Hunt, et al, "Model-Free Prediction of Large Spatiotemporally Chaotic Systems from Data: A Reservoir Computing Approach", APS Physics, 2018.
[7] Matthew Cappucci, "Study says ‘specific’ weather forecasts can’t be made more than 10 days in advance", Washington Post, 2019.
[8] Fred Schmude, "Will Machine Learning Make Us Rethink Chaos Theory?", StormGeo, 2021.