Mirror of https://github.com/wassname/ml_debug.git (synced 2026-06-27 01:00:14 +08:00)
docs(pinn): add Wang 2021 and Rathore 2024 evidence files
@@ -0,0 +1,37 @@
# Rathore et al. 2024 -- Challenges in Training PINNs: A Loss Landscape Perspective

Source: https://arxiv.org/abs/2402.01868

## Citation Information

- Title: Challenges in Training PINNs: A Loss Landscape Perspective
- Authors: Pratik Rathore, Weimu Lei, Zachary Frangella, Lu Lu, Madeleine Udell
- arXiv ID: 2402.01868 (cs.LG)
- Submitted: 2 Feb 2024 (v1), last revised 3 Jun 2024 (v2)
- Venue: ICML 2024 Oral
- Pages: 33 pages (including appendices), 10 figures, 3 tables

## Abstract

This paper explores challenges in training Physics-Informed Neural Networks (PINNs), emphasizing the role of the loss landscape in the training process. We examine difficulties in minimizing the PINN loss function, particularly due to ill-conditioning caused by differential operators in the residual term. We compare gradient-based optimizers Adam, L-BFGS, and their combination Adam+L-BFGS, showing the superiority of Adam+L-BFGS, and introduce a novel second-order optimizer, NysNewton-CG (NNCG), which significantly improves PINN performance. Theoretically, our work elucidates the connection between ill-conditioned differential operators and ill-conditioning in the PINN loss and shows the benefits of combining first- and second-order optimization methods. Our work presents valuable insights and more powerful optimization strategies for training PINNs, which could improve the utility of PINNs for solving difficult partial differential equations.
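
The link between an ill-conditioned differential operator and an ill-conditioned squared-residual loss can be illustrated with a finite-difference analogy. This is a sketch for intuition only, not the paper's analysis: it assumes the operator is the standard 1D second-difference matrix tridiag(-1, 2, -1), whose eigenvalues are known in closed form, and the function names are hypothetical.

```python
import math


def laplacian_condition_number(n: int) -> float:
    """Condition number of the n x n matrix A = tridiag(-1, 2, -1).

    The eigenvalues are lambda_k = 2 - 2*cos(k*pi/(n+1)) for k = 1..n,
    so no linear algebra library is needed.
    """
    lam_min = 2.0 - 2.0 * math.cos(math.pi / (n + 1))
    lam_max = 2.0 - 2.0 * math.cos(n * math.pi / (n + 1))
    return lam_max / lam_min


def residual_loss_condition_number(n: int) -> float:
    """Condition number of the Hessian of 0.5 * ||A u - f||^2.

    The Hessian is A^T A. A is symmetric, so A^T A = A^2 and the
    condition number is the square of cond(A).
    """
    return laplacian_condition_number(n) ** 2


# Finer grids make the operator, and the loss even more so, harder to optimize.
for n in (10, 100, 1000):
    print(n, laplacian_condition_number(n), residual_loss_condition_number(n))
```

The operator's condition number grows roughly like n squared, and the squared-residual loss squares it again, which is one way to see why plain first-order methods can stall on residual terms.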

## Key Claims and Contributions

1. **Problem Identification**: Ill-conditioning in PINN loss landscapes caused by differential operators in residual terms
2. **Optimizer Comparison**: Empirical evaluation of Adam, L-BFGS, and Adam+L-BFGS for PINN training
3. **Novel Method**: Introduction of NysNewton-CG (NNCG), a second-order optimizer with significant performance improvements
4. **Theoretical Connection**: Establishes a link between ill-conditioned differential operators and ill-conditioning in the PINN loss landscape
5. **Hybrid Optimization**: Demonstrates the benefits of combining first-order and second-order optimization methods
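
The handoff pattern behind the hybrid claim (a first-order phase followed by a second-order phase) can be sketched on a toy problem. Everything here is an assumption for illustration: the loss is a hypothetical ill-conditioned 2D quadratic rather than a PINN loss, and an exact Newton step stands in for L-BFGS or NNCG, which are not implemented.

```python
import math

KAPPA = 1.0e4  # hypothetical condition number of the toy loss


def loss(x):
    """Toy ill-conditioned quadratic: 0.5 * (x0^2 + KAPPA * x1^2)."""
    return 0.5 * (x[0] ** 2 + KAPPA * x[1] ** 2)


def grad(x):
    return [x[0], KAPPA * x[1]]


def adam(x, steps, lr=1e-2, b1=0.9, b2=0.999, eps=1e-8):
    """Standard Adam update with bias correction (first-order phase)."""
    x = list(x)
    m = [0.0] * len(x)
    v = [0.0] * len(x)
    for t in range(1, steps + 1):
        g = grad(x)
        for i in range(len(x)):
            m[i] = b1 * m[i] + (1.0 - b1) * g[i]
            v[i] = b2 * v[i] + (1.0 - b2) * g[i] ** 2
            m_hat = m[i] / (1.0 - b1 ** t)
            v_hat = v[i] / (1.0 - b2 ** t)
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def newton(x, steps):
    """Newton steps using the exact diagonal Hessian diag(1, KAPPA).

    Stand-in for the second-order phase; a real PINN would need a
    quasi-Newton or Newton-CG method instead of an exact Hessian.
    """
    x = list(x)
    hess_diag = [1.0, KAPPA]
    for _ in range(steps):
        g = grad(x)
        x = [xi - gi / hi for xi, gi, hi in zip(x, g, hess_diag)]
    return x


x0 = [1.0, 1.0]
x_adam = adam(x0, steps=200)         # phase 1: first-order warm-up
x_hybrid = newton(x_adam, steps=1)   # phase 2: second-order refinement
print(loss(x0), loss(x_adam), loss(x_hybrid))
```

On a convex quadratic the Newton step alone would already reach the minimum, so this shows only the mechanics of switching optimizers, not the benefit the paper reports for non-convex PINN losses.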

## Metadata

- License: CC BY 4.0
- Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
- DOI: https://doi.org/10.48550/arXiv.2402.01868
- Available formats: PDF, HTML (experimental), TeX Source

## Access

- PDF: https://arxiv.org/pdf/2402.01868
- HTML: https://arxiv.org/html/2402.01868v2
- TeX Source: https://arxiv.org/src/2402.01868
@@ -0,0 +1,34 @@
# Wang et al. 2021 -- Understanding and Mitigating Gradient Flow Pathologies in Physics-informed Neural Networks
Source: https://arxiv.org/abs/2001.04536

## Paper Metadata

- **Authors:** Sifan Wang, Yujun Teng, Paris Perdikaris
- **Submitted:** 13 Jan 2020
- **arXiv ID:** 2001.04536 [cs.LG]
- **DOI:** https://doi.org/10.48550/arXiv.2001.04536
- **Length:** 28 pages, 18 figures
- **Subjects:** Machine Learning (cs.LG); Numerical Analysis (math.NA); Machine Learning (stat.ML)
- **Code & Data:** https://github.com/PredictiveIntelligenceLab/GradientPathologiesPINNs

## Abstract

The widespread use of neural networks across different scientific domains often involves constraining them to satisfy certain symmetries, conservation laws, or other domain knowledge. Such constraints are often imposed as soft penalties during model training and effectively act as domain-specific regularizers of the empirical risk loss. Physics-informed neural networks is an example of this philosophy in which the outputs of deep neural networks are constrained to approximately satisfy a given set of partial differential equations.

In this work we review recent advances in scientific machine learning with a specific focus on the effectiveness of physics-informed neural networks in predicting outcomes of physical systems and discovering hidden physics from noisy data. We will also identify and analyze a fundamental mode of failure of such approaches that is related to numerical stiffness leading to unbalanced back-propagated gradients during model training.

To address this limitation we present a learning rate annealing algorithm that utilizes gradient statistics during model training to balance the interplay between different terms in composite loss functions. We also propose a novel neural network architecture that is more resilient to such gradient pathologies.

Taken together, our developments provide new insights into the training of constrained neural networks and consistently improve the predictive accuracy of physics-informed neural networks by a factor of 50-100x across a range of problems in computational physics.
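
The abstract describes balancing loss terms with gradient statistics but does not give the update rule. The sketch below assumes one plausible form: the weight of a non-residual term (for example a boundary loss) is moved toward the ratio of the largest residual-gradient magnitude to the mean gradient magnitude of that term, with a moving average. The function name, the smoothing factor `alpha`, and the sample gradients are all hypothetical and should be checked against the paper and its code before use.

```python
def anneal_weight(lam, grad_residual, grad_term, alpha):
    """One assumed update of the weight for a non-residual loss term.

    lam_hat = max|grad_residual| / mean|grad_term|
    lam_new = (1 - alpha) * lam + alpha * lam_hat

    grad_residual and grad_term are flat lists holding the gradients of the
    PDE residual loss and of the weighted term w.r.t. the network parameters.
    """
    max_residual = max(abs(g) for g in grad_residual)
    mean_term = sum(abs(g) for g in grad_term) / len(grad_term)
    if mean_term == 0.0:
        # No gradient signal from this term; keep the current weight.
        return lam
    lam_hat = max_residual / mean_term
    return (1.0 - alpha) * lam + alpha * lam_hat


# Hypothetical gradients: the residual term dominates the boundary term.
grad_residual = [4.0, -8.0, 2.0]
grad_boundary = [0.1, -0.3, 0.2]

new_lam = anneal_weight(1.0, grad_residual, grad_boundary, alpha=0.5)
print(new_lam)  # larger than 1.0, so the boundary term is up-weighted
```

In a training loop the returned weight would multiply the boundary loss at the next step, so that its gradients become comparable in scale to those of the residual loss.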

## Key Contributions

1. **Problem Identification:** Identifies gradient flow pathologies in PINNs arising from numerical stiffness, causing unbalanced back-propagated gradients
2. **Learning Rate Annealing:** Proposes an algorithm using gradient statistics to balance different loss components during training
3. **Novel Architecture:** Introduces a new neural network architecture more resilient to gradient pathologies
4. **Empirical Results:** Demonstrates 50-100x improvement in predictive accuracy across computational physics problems

## Access

- PDF: https://arxiv.org/pdf/2001.04536
- TeX Source: https://arxiv.org/src/2001.04536