From high-school physics to GANs: essentials for mastering generative machine learning [1/2]

Alex Honchar
Towards Data Science
7 min read · Jan 15, 2021


Wave motion illustration http://animatedphysics.com/insights/modelling-photon-phase/

GANs and other generative machine learning algorithms are still hyped and work fantastically with images, text, and sound. They can not only generate data for fun but also solve important theoretical problems and boost production ML pipelines. Unfortunately, the typical practical use case today is limited to “fine-tune a pre-trained StyleGAN2 to generate zombies”. What’s worse, almost no one bothers to explain why we need generative modeling in the real world and where the roots of that need come from.

The following two articles aim to bridge the gap between the modern cool stuff and the slightly forgotten old-school mathematical modeling that managed without big datasets and powerful neural networks. After grasping this material, you will be able to:

  • Understand why generative machine learning is a more powerful data modeling paradigm, one that has been with us for centuries
  • Learn and implement any modern generative model faster, knowing the scientific fundamentals and the needs that motivate it
  • Formulate novel scenarios for generative modeling and extract additional value for your R&D processes or final product

As always, you can find the source code for all the experiments on my GitHub.

Classical mechanics as generative modeling

Illustration of the motion of an undamped oscillator, i.e., a pendulum in our case. Image from https://www.acs.psu.edu/drussell/Demos/phase-diagram/phase-diagram.html

Let’s recall physics classes. The very first topics you were taught were force, motion, time, velocity, and acceleration in simple 1D cases. Even before learning differentiation or integration in calculus classes, you could easily calculate the velocity of an object from times and the corresponding positions (e.g., v = Δx/Δt between two recorded points). The same strategy applies to more complicated mechanics such as spring forces, pendulums, multi-dimensional motion, etc.: you just substitute the appropriate formulas and redo the calculation routines. A simplified version of such a modeling process would be the following:

  • Identify which object is moving, what its positions are, the origin of the coordinate system, and the initial conditions
  • Find an appropriate model that describes the behavior of the given object under the given conditions
  • Solve the model’s equations under the given conditions to find the velocity, acceleration, or another variable of interest
  • Analyze the solution and its validity

For example, in the case of a pendulum system (in the illustrations above), you can define the object’s dynamics through its Lagrangian, i.e., the difference between its kinetic and potential energy, and if you solve the resulting Euler-Lagrange equation for the single degree of freedom (the angle of oscillation), you get the equation of motion for the trajectory, which you can then solve under different conditions.
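
For reference, here is a compact sketch of that derivation for the ideal (undamped, small-angle) pendulum, written in my own notation:

```latex
% Lagrangian: kinetic energy minus potential energy
L(\theta, \dot{\theta}) = \tfrac{1}{2} m l^2 \dot{\theta}^2 - m g l (1 - \cos\theta)

% Euler-Lagrange equation for the single degree of freedom \theta
\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} - \frac{\partial L}{\partial \theta} = 0
\quad\Longrightarrow\quad
\ddot{\theta} + \frac{g}{l}\sin\theta = 0

% Small angles (\sin\theta \approx \theta), starting at rest from \theta_0:
\theta(t) = \theta_0 \cos\Bigl(\sqrt{g/l}\; t\Bigr)
```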

And now the coolest part:

Just as we sample faces, cats, and songs with GANs, we have been able to sample complex motions of physical objects simply by solving equations. For centuries. It literally got us to the Moon. Without gigabytes of data and GPUs for deep neural nets.

I bet they didn’t teach you this angle in physics class. If you have the exact formula for the pendulum, you have your “pendulum-GAN”: you just need to sample the length, gravity, amplitude, etc., and insert them into the formula; this way you can generate as many pendulums as you want. The only difference is that a GAN takes a quasi-random vector as input and its “formula” is a black-box neural net trained on data. A couple of illustrations of sampled trajectories are shown below, and you can find the code here.
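
To make the “pendulum-GAN” idea concrete, here is a minimal sketch (my own illustration, not the code from the repository) that samples random pendulum parameters and plugs them into the small-angle closed-form solution:

```python
import numpy as np

def sample_pendulum_trajectory(rng, n_steps=100, t_max=10.0):
    """Draw random pendulum parameters and return a small-angle trajectory."""
    l = rng.uniform(0.5, 2.0)       # rope length, m
    g = 9.81                        # gravitational acceleration, m/s^2
    theta0 = rng.uniform(0.1, 0.5)  # initial amplitude, rad
    t = np.linspace(0.0, t_max, n_steps)
    # Closed-form small-angle solution: theta(t) = theta0 * cos(sqrt(g/l) * t)
    return t, theta0 * np.cos(np.sqrt(g / l) * t)

# Every draw is a new "generated" pendulum, no training data required
rng = np.random.default_rng(42)
trajectories = [sample_pendulum_trajectory(rng) for _ in range(5)]
```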

Our generative models should be able to sample trajectories that look somewhat like these

Generative modeling as function approximation

A neural ODE network iteratively approximating another oscillatory behavior. Can we do the same with simpler generative models? From my other blog post: https://towardsdatascience.com/neural-odes-breakdown-of-another-deep-learning-breakthrough-3e78c7213795

Scientists observe data too, but they derive the formulas in their heads. Knowing that neural networks are universal approximators, we can instead train them to approximate such formulas from the data. Let’s conduct an experiment where we don’t know the physical properties of the pendulum and want to learn a black-box formula from observations with different rope lengths, ball masses, angles, etc. We can generate our trajectories step by step, with the step number and the above-mentioned properties as the input and the trajectory point as the output of our model.
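
A minimal sketch of such a model (my own simplification; the author’s actual implementation is linked below) could look like this in Keras: the inputs are the physical parameters plus the time step, the output is the pendulum angle at that step, and noise is added to the training targets to mimic real measurements:

```python
import numpy as np
import tensorflow as tf

# Training pairs: (theta0, omega0, m, l, t) -> noisy pendulum angle at time t
rng = np.random.default_rng(0)
n = 10_000
params = rng.uniform([0.1, 0.1, 0.5, 0.5, 0.0],
                     [1.0, 1.0, 2.0, 2.0, 10.0], size=(n, 5))
theta0, l, t = params[:, 0], params[:, 3], params[:, 4]
# Small-angle ground truth; note that m and omega0 don't enter the ideal
# solution, but we keep them as inputs to match the experiment's setup
clean = theta0 * np.cos(np.sqrt(9.81 / l) * t)
y = clean + rng.normal(0.0, 0.05, size=n)  # "real-world" measurement noise

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(5,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),  # deterministic point estimate per step
])
model.compile(optimizer="adam", loss="mse")
model.fit(params, y, epochs=10, batch_size=128, verbose=0)
```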

More details and the implementation are here. To simulate “real-world” conditions and add uncertainty to the observations, we add different kinds of noise to the data; check the details in the source code. As we can see, we can successfully generate pendulum trajectories with different angles, rope lengths, etc., without knowing the exact mathematical model, just by observing the noisy data (as happens in real science):

A noisy training data example (left); the pendulum trajectory with theta=1, omega=1, m=1, l=1 and its approximation (middle); the pendulum trajectory with theta=0.5, omega=0.5, m=1.5, l=1.5 (right). Predictions are made with a deterministic neural network

It is interesting to notice how the approximation accuracy gets worse as the noise influence increases, as expected. However, in statistical learning we say that we have truly learned how to generate data only once we have learned the data distribution from the observed samples. Our approximation is just a deterministic function that has none of the properties of a distribution; we cannot sample different trajectories from it.

Bayesian generative modeling

Example of regression with Gaussian processes, a Bayesian machine learning algorithm that models uncertainty in the regions where we don’t have enough data and cannot be confident about a prediction. Image from https://jessicastringham.net/2018/05/18/Gaussian-Processes/
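
As an aside, here is a minimal sketch of that kind of Gaussian-process regression (my own example, using scikit-learn rather than the code from the linked post), where the predicted standard deviation quantifies the model’s uncertainty:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# A few noisy observations of a sine wave
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 6.0, size=(15, 1))
y = np.sin(X).ravel() + rng.normal(0.0, 0.1, size=15)

# RBF kernel for smoothness, WhiteKernel to absorb the observation noise
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X, y)

X_test = np.linspace(0.0, 6.0, 200).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)  # std grows away from the data
```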

In simple words: every sample (even for the same step of our pendulum) can be slightly different, because we have introduced noise into the data, which gives our model a reason to be uncertain about every step-by-step prediction. We therefore need to capture this noise in the model.

We can turn both the outputs of the model and the weights of the model from deterministic point estimates into distributions we can sample from. The noise in the outputs is called aleatoric uncertainty, and the uncertainty in the weights is called epistemic. As in the previous example, we generate our trajectory step by step; however, at every step we can now sample several potential options that fit the trajectory, depending on the uncertainty level:

A noisy training data example (left); the pendulum trajectory with theta=1, omega=1, m=1, l=1 and its approximation (middle); the pendulum trajectory with theta=0.5, omega=0.5, m=1.5, l=1.5 (right). Predictions are made with a Bayesian neural network; generated trajectories are shown in grey, with 5 of them sampled per image

I’ve implemented the same neural network as in the approximation case, but with the help of TensorFlow Probability I could easily turn it into a Bayesian neural network. Let me know if you’d like an additional tutorial on Bayesian machine learning! As you can see, the result is now of the same kind as what we observe with the “GAN framework”: we took the data, modeled it, and we are able to sample different realistic samples. We can also notice that the stochastic generative model actually has better accuracy within its certainty intervals. How all this relates to actual GANs, VAEs, and other generative models, we will review in the next blog post.
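
For readers who want a taste of it, here is a hedged sketch (my own, not the exact code from the repository; the layer names and Keras compatibility depend on your TensorFlow/TFP versions) of how TensorFlow Probability can make both the weights and the outputs stochastic: DenseFlipout layers give epistemic uncertainty over the weights, and a Normal output head gives aleatoric uncertainty over the predictions. A toy sine-wave dataset stands in for the pendulum data:

```python
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Tiny noisy dataset: x -> sin(x) + noise (a stand-in for the pendulum data)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 6.0, size=(256, 1)).astype("float32")
y = (np.sin(x) + rng.normal(0.0, 0.1, size=(256, 1))).astype("float32")

model = tf.keras.Sequential([
    # Flipout layers: weights are distributions -> epistemic uncertainty
    tfp.layers.DenseFlipout(64, activation="relu"),
    tfp.layers.DenseFlipout(2),  # two outputs: mean and raw scale
    # Normal output head -> aleatoric uncertainty
    tfp.layers.DistributionLambda(
        lambda p: tfd.Normal(loc=p[..., :1],
                             scale=1e-3 + tf.math.softplus(p[..., 1:]))),
])

# Maximize the likelihood of the noisy observations (negative log-likelihood)
model.compile(optimizer="adam", loss=lambda y_true, dist: -dist.log_prob(y_true))
model.fit(x, y, epochs=20, batch_size=64, verbose=0)

# Each forward pass re-samples the weights and the output noise, so repeated
# calls yield different plausible curves, i.e., we can sample trajectories
samples = [model(x[:5]).sample().numpy() for _ in range(5)]
```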

Takeaways

This short introduction aims to demonstrate, with a simple example, that:

  • “Generating” things has been a natural idea in science for centuries: the equations we solve to answer questions about real-life processes are derived from mathematical models that can “sample” those processes just as GANs do
  • In a data-driven world, we want to build these models “automatically”, without defining exact formulas and their parameters the way scientists do, while still being able to model real-world objects and solve the equations
  • We can trick our way there with “predictive” models that approximate step-by-step solutions of our mathematical models of physical objects; however, we can do it in a more elegant way

The next article will take it from here and show GANs and VAEs that actually act as mathematical models of physical processes; I hope this will motivate you to study and use generative models in many more complex and interesting industrial and research scenarios.

P.S.
If you found this content useful and insightful, you can support me on Bitclout. Follow me also on Facebook for AI articles that are too short for Medium, on Instagram for personal stuff, and on LinkedIn! Contact me if you want to collaborate on interpretable AI applications or other ML projects.
