
Unraveling the Design Pattern of Physics-Informed Neural Networks: Part 04

Leveraging gradient-enhanced learning to improve PINN training efficiency

Photo by Hassaan Qaiser on Unsplash

Welcome to the 4th blog of this series, where we continue our exciting journey of exploring design patterns of physics-informed neural networks (PINNs) 🙌

In this blog, we will check out a research paper that proposed a new variant of PINN called gradient-enhanced PINN. More concretely, we will look into the **problem**, the **solution**, the **benchmark**, as well as the **strengths & weaknesses**, to distill the design pattern proposed by the paper.

As this series continues to expand, the collection of PINN design patterns grows even richer🙌 Here’s a sneak peek at what awaits you:

PINN design pattern 01: Optimizing the residual point distribution

PINN design pattern 02: Dynamic solution interval expansion

PINN design pattern 03: Training PINN with gradient boosting

PINN design pattern 05: Automated hyperparameter tuning

PINN design pattern 06: Causal PINN training

PINN design pattern 07: Active learning with PINN

Let’s get started!


1. Paper at a glance 🔍

  • Title: Gradient-enhanced physics-informed neural networks for forward and inverse PDE problems
  • Authors: J. Yu, L. Lu, X. Meng, G. E. Karniadakis
  • Institutes: St. Mark’s School of Texas, University of Pennsylvania, Brown University
  • Link: arXiv

2. Design pattern 🎨

2.1 Problem 🎯

In practice, it is commonly observed that conventional PINNs achieve only limited accuracy even with many training points, especially when dealing with challenging PDEs with stiff solutions. This limitation restricts their effectiveness in solving diverse forward and inverse PDE problems with high precision.

PINN workflow. Conventional PINNs usually have limited accuracy even with many residual points. One promising way to boost PINNs’ accuracy is by training PINNs with gradient-enhanced learning algorithms. (Image by this blog author)

2.2 Solution 💡

One promising way to boost PINNs’ accuracy is by adopting a gradient-enhanced learning approach to train the PINN.

Gradient-enhanced learning has been proven useful in traditional machine learning [2]. As shown in the illustration below, in addition to the usual input-output pair (x, y), gradient-enhanced learning also incorporates the known value of the function gradient dy/dx as an extra supervision signal. This type of learning can be effective when the gradient information is cheap to obtain (e.g., analytically available or easily measured).

Conventional learning (upper) only requires the model predictions at x to match the true function value f(x); gradient-enhanced learning (lower) additionally requires that the derivative of the model predictions at x matches the known gradient value df(x)/dx. (Image adapted from Wikipedia)
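To make this concrete, here is a minimal sketch of gradient-enhanced learning for a plain regression task, written in PyTorch. The framework choice, the tiny network, and the target function sin(x) are illustrative assumptions, not details from the paper:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Training data: function values plus known gradient values
# (hypothetical target f(x) = sin(x), so df/dx = cos(x))
x = torch.linspace(0.0, 3.14, 50).reshape(-1, 1).requires_grad_(True)
y_true = torch.sin(x).detach()    # usual supervision signal
dy_true = torch.cos(x).detach()   # extra supervision signal: df/dx

for step in range(2000):
    optimizer.zero_grad()
    y_pred = model(x)
    # Derivative of the model output w.r.t. the input, via autograd
    dy_pred = torch.autograd.grad(
        y_pred, x, grad_outputs=torch.ones_like(y_pred), create_graph=True
    )[0]
    # Value loss + weighted gradient loss (w is a tunable hyperparameter)
    w = 1.0
    loss = ((y_pred - y_true) ** 2).mean() + w * ((dy_pred - dy_true) ** 2).mean()
    loss.backward()
    optimizer.step()
```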

The same idea can also be applied to PINN training, as demonstrated in the paper.

Take the simple 2D Laplace equation (∂²u/∂x² + ∂²u/∂y² = 0) as an example: when employing a PINN to solve it, we enforce the PDE residual f to be zero, where f = ∂²u/∂x² + ∂²u/∂y². The residual f essentially measures how well the prediction aligns with the governing equation and constitutes the PDE loss term in the overall loss function used to train the PINN.
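In code, the residual f can be computed with automatic differentiation. Below is a minimal PyTorch sketch; the helper name laplace_residual and the network interface u_net are hypothetical:

```python
import torch

def laplace_residual(u_net, xy):
    """Compute f = u_xx + u_yy at collocation points xy of shape (N, 2)."""
    xy = xy.requires_grad_(True)
    u = u_net(xy)
    # First derivatives du/dx, du/dy
    du = torch.autograd.grad(u, xy, torch.ones_like(u), create_graph=True)[0]
    u_x, u_y = du[:, 0:1], du[:, 1:2]
    # Second derivatives u_xx, u_yy
    u_xx = torch.autograd.grad(u_x, xy, torch.ones_like(u_x), create_graph=True)[0][:, 0:1]
    u_yy = torch.autograd.grad(u_y, xy, torch.ones_like(u_y), create_graph=True)[0][:, 1:2]
    return u_xx + u_yy  # the PDE residual f, driven toward zero during training
```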

In gradient-enhanced PINN (gPINN), we can additionally enforce the derivatives of the PDE residual to be zero as well: ∂f/∂x = 0 and ∂f/∂y = 0.

The rationale for doing this is simple: because f is zero across the entire simulation domain, its gradients must also be zero everywhere. As a result, we have two additional loss terms besides the usual PDE loss f = 0.
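A sketch of the resulting gPINN loss could then look as follows, reusing the laplace_residual helper from above. The weights w_x and w_y are the new hyperparameters this approach introduces (their default values here are arbitrary):

```python
def gpinn_loss(u_net, xy, w_x=0.01, w_y=0.01):
    """PDE loss plus weighted losses on the residual's spatial derivatives."""
    f = laplace_residual(u_net, xy)
    # Gradients of the residual itself: df/dx and df/dy
    df = torch.autograd.grad(f, xy, torch.ones_like(f), create_graph=True)[0]
    f_x, f_y = df[:, 0:1], df[:, 1:2]
    return (f ** 2).mean() + w_x * (f_x ** 2).mean() + w_y * (f_y ** 2).mean()
```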

2.3 Why the solution might work 🛠️

The key to gPINN’s effectiveness is that the gradient of the residual provides additional information about the function’s behavior, and can therefore guide the learning process more effectively. This benefit is well known in traditional machine learning, and the current paper shows that the same gains carry over to PINN training.

2.4 Benchmark ⏱️

The paper considered a total of 6 benchmark problems: 2 forward problems and 2 inverse problems solved with the standard gPINN, plus 2 PDEs with stiff solutions solved with gPINN+RAR (residual-based adaptive refinement for sampling residual points):

  • 1D Poisson equation (forward problem, solved with standard gPINN): the Poisson equation is a fundamental partial differential equation in mathematical physics that relates a potential field to its source distribution (a minimal training sketch for this problem is given after this list).
Dirichlet boundary conditions: u(x=0) = 0, u(x=π) = π
  • Diffusion-reaction equation (forward problem, solved with standard gPINN): this equation models reactions combined with the diffusion of substances. The forward problem here involves predicting the concentration of substances given initial conditions and reaction rates.
D = 1 (diffusion coefficient)
R is the chemical reaction term
Initial and boundary conditions
  • Brinkman-Forchheimer equation (inverse problem to identify the effective viscosity νₑ and permeability K, solved with standard gPINN): this equation describes flow in porous media, which is prevalent in fields such as oil recovery and groundwater flow.
Boundary conditions: u(0) = u(1) = 0, H = 1, ν = 1e-3, ε = 0.4, g = 1
  • 1D diffusion-reaction system (inverse problem to identify the space-dependent reaction rate k(x), solved with standard gPINN): similar to the second problem, this is also a diffusion-reaction equation.
Diffusion coefficient λ = 0.01, f = sin(2πx). A separate neural network is used to approximate k, in addition to the network to predict u.
  • 1D Burgers equation (forward problem, solved with gPINN+RAR): this is a fundamental equation in fluid dynamics, combining non-linear convection and diffusion.
Initial condition: u(x, 0) = -sin(πx), boundary conditions: u(-1, t) = u(1, t) = 0, ν = 0.01/π
  • Allen-Cahn equation (forward problem, solved with gPINN+RAR): this equation models the process of phase separation, which is crucial in materials science.
Initial condition: u(x, 0) = x² cos(πx), boundary conditions: u(-1, t) = u(1, t) = -1, D = 0.001
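As referenced in the first benchmark item above, here is a minimal end-to-end gPINN training sketch for a 1D Poisson-type problem with the boundary conditions u(0) = 0, u(π) = π. The zero source term and the gradient-loss weight of 0.01 are placeholders, not the settings used in the paper:

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 32), nn.Tanh(), nn.Linear(32, 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x = (torch.rand(100, 1) * torch.pi).requires_grad_(True)  # residual points
xb = torch.tensor([[0.0], [torch.pi]])                    # boundary points
ub = torch.tensor([[0.0], [torch.pi]])                    # boundary values u(0)=0, u(pi)=pi

for step in range(5000):
    opt.zero_grad()
    u = net(x)
    u_x = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    f = u_xx  # PDE residual u'' - s(x), with placeholder source s(x) = 0
    f_x = torch.autograd.grad(f, x, torch.ones_like(f), create_graph=True)[0]
    loss = (
        (f ** 2).mean()                 # PDE loss
        + 0.01 * (f_x ** 2).mean()      # gradient-enhanced loss (illustrative weight)
        + ((net(xb) - ub) ** 2).mean()  # boundary loss
    )
    loss.backward()
    opt.step()
```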

The benchmark studies showed that:

  • the proposed gradient-enhanced PINN learning (gPINN) achieved higher accuracy with fewer residual points compared to conventional PINN;
  • gPINN combined with advanced residual points sampling schemes (e.g., RAR) delivered the best performance for challenging PDEs.

2.5 Strengths and Weaknesses ⚡

Strengths 💪

  • Improved accuracy not only for the predicted function values but also for the predicted function derivatives.
  • Faster convergence rate.
  • Performs better than traditional PINN with fewer training points.
  • Suitable for both forward and inverse problems.
  • Can be easily combined with advanced residual points sampling schemes (see previous blog) to further enhance the performance, especially in PDEs with solutions that have steep gradients.

Weaknesses 📉

  • Introduces new weighting parameters for balancing the gradient loss terms in the overall PINN loss function.
  • Increases the complexity of model training, as well as potentially the computational cost.

2.6 Alternatives 🔀

As this is the first paper to introduce the gradient-enhanced learning paradigm to the PINN field, there are currently no directly comparable approaches. In the paper, all comparisons are conducted between the vanilla PINN, gPINN, and gPINN with the RAR sampling scheme.

3 Potential Future Improvements 🌟

There are several possibilities to further improve the proposed strategy:

  • Automated parameter tuning for the weights of the gradient loss term.
  • Improved selection of residual points to evaluate the extra gradient loss. The current paper uses the same residual points to evaluate both PDE residuals and the gradients of PDE residuals. However, better performance may be achieved if the two sets of residual points are not the same.
  • More efficient automatic differentiation strategy for computing high-order derivatives.

4 Takeaways 📝

In this blog, we looked at enhancing PINN accuracy and training efficiency through gradient-enhanced learning. Here are the highlights of the design pattern proposed in the paper:

  • [Problem]: How to enhance PINNs’ accuracy and training efficiency?
  • [Solution]: Gradient-enhanced learning, where not only the PDE residuals but also their gradients are enforced to be zero in the PINN loss function.
  • [Potential benefits]: 1. Better performance than the vanilla PINN with fewer residual points. 2. Improved accuracy not only for the predicted function values but also for the predicted function derivatives.

As usual, I have prepared a PINN design card to summarize the takeaways:

PINN design pattern proposed in the paper. (Image by this blog author)

I hope you found this blog useful! To learn more about PINN design patterns, feel free to check out the other posts in this series.

Looking forward to sharing more insights with you in the upcoming blogs!


Reference 📑

[1] Yu et al., Gradient-enhanced physics-informed neural networks for forward and inverse PDE problems, arXiv, 2021.

[2] Laurent et al., An overview of gradient-enhanced metamodels with applications, Arch Computat Methods Eng, 2019.

