Marco Eckhoff and Markus Reiher (2023)
Highlighted by Jan Jensen
While machine learning potentials (MLPs) can give you DFT accuracy at force field (FF) cost, they also come with some practical problems: they typically must be retrained from scratch whenever new data is added (to avoid catastrophic forgetting), and most structural descriptors struggle to efficiently represent a large number of different chemical elements.
This paper presents solutions to some of these problems. First, it introduces element-embracing atom-centered symmetry functions (eeACSFs), which incorporate periodic table trends to handle many elements efficiently. Second, it proposes a lifelong learning framework, combining continual learning strategies, the continual resilient (CoRe) optimizer, and uncertainty quantification, that allows MLPs to adapt to new data incrementally without losing prior knowledge.
The eeACSFs differ from conventional ACSFs by integrating element information based on periodic table trends rather than creating separate descriptors for each element combination, which allows them to efficiently handle systems with multiple elements without a combinatorial increase in descriptor size.
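To make this concrete, here is a minimal sketch of an element-weighted radial symmetry function in Python. The Gaussian radial part and cosine cutoff are standard ACSF ingredients; the `element_weight` table (group numbers here) is a hypothetical stand-in for the periodic-table-based weighting, not the paper's actual functional form.

```python
import numpy as np

def cutoff(r, r_c):
    """Standard smooth cosine cutoff; zero beyond r_c."""
    return np.where(r < r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def radial_eeacsf(r_ij, neighbor_z, eta, r_s, r_c, element_weight):
    """One radial descriptor value for a central atom: each neighbor's
    Gaussian contribution is scaled by an element-dependent weight, so a
    single descriptor channel covers all elements instead of requiring
    one channel per element pair."""
    w = np.array([element_weight[z] for z in neighbor_z])
    g = np.exp(-eta * (r_ij - r_s) ** 2) * cutoff(r_ij, r_c)
    return np.sum(w * g)

# Hypothetical weights following a periodic-table trend (group number).
element_weight = {1: 1.0, 6: 14.0, 7: 15.0, 8: 16.0}  # H, C, N, O

# Central atom with C, N, O neighbors at 1.1, 2.0, 2.5 Angstrom.
print(radial_eeacsf(np.array([1.1, 2.0, 2.5]), [6, 7, 8],
                    eta=0.5, r_s=0.0, r_c=6.0,
                    element_weight=element_weight))
```

The key point is that adding a new element only requires new weights, not new descriptor channels, so the descriptor size stays fixed as the number of elements grows.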
The CoRe optimizer is designed to balance efficient convergence with stability, adapting dynamically to the learning context. It combines the robustness of RPROP (resilient backpropagation) with the performance benefits of Adam. Specifically, the optimizer adjusts learning rates based on gradient history, which allows for faster convergence initially and a more stable final accuracy. Additionally, it includes a plasticity factor that selectively freezes parameters critical to prior knowledge while allowing other parameters to adapt. This prevents the “catastrophic forgetting” problem common in continual learning, where new learning can overwrite prior knowledge.
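A toy sketch of what such a hybrid update might look like is shown below. The constants, update rules, and plasticity criterion are illustrative simplifications, not the actual CoRe algorithm.

```python
import numpy as np

class CoReSketch:
    """Simplified sketch of a CoRe-style update: Adam-like gradient
    smoothing combined with RPROP-style per-parameter step-size
    adaptation and a plasticity factor that can freeze parameters."""

    def __init__(self, n_params, step=1e-3, beta=0.9,
                 eta_minus=0.5, eta_plus=1.2,
                 step_min=1e-6, step_max=1e-2):
        self.m = np.zeros(n_params)           # smoothed gradient (Adam-like)
        self.prev_m = np.zeros(n_params)
        self.steps = np.full(n_params, step)  # per-parameter steps (RPROP-like)
        self.plasticity = np.ones(n_params)   # 1 = free to adapt, 0 = frozen
        self.beta, self.eta_minus, self.eta_plus = beta, eta_minus, eta_plus
        self.step_min, self.step_max = step_min, step_max

    def update(self, params, grad):
        self.m = self.beta * self.m + (1 - self.beta) * grad
        sign = np.sign(self.m * self.prev_m)
        # Grow the step size while the smoothed gradient keeps its sign,
        # shrink it after a sign flip (the RPROP rule).
        self.steps = np.clip(
            np.where(sign > 0, self.steps * self.eta_plus,
                     np.where(sign < 0, self.steps * self.eta_minus,
                              self.steps)),
            self.step_min, self.step_max)
        self.prev_m = self.m.copy()
        # The plasticity mask scales (and can fully freeze) each update.
        return params - self.plasticity * self.steps * np.sign(self.m)

opt = CoReSketch(n_params=3)
params = opt.update(np.zeros(3), grad=np.array([0.1, -0.2, 0.05]))
```

Taking only the sign of the smoothed gradient (rather than its magnitude) gives the RPROP-like robustness to noisy gradients, while the plasticity mask is what protects previously learned parameters.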
The lifelong learning approach includes adaptive selection factors: each data point has a selection factor that is updated based on its contribution to the loss function. If a point is well represented in training, its selection factor decreases, reducing its likelihood of being chosen in future training epochs. Conversely, underrepresented data have higher selection factors, ensuring they are revisited. In addition, redundant data (those with low loss contributions) are excluded from the training set, reducing the memory and computational load. Data points that the model consistently fails to learn are also excluded, which improves training efficiency and prevents model instability from conflicting data.
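A rough sketch of how such loss-driven selection could work is given below; the thresholds, decay/boost factors, and exclusion criteria are hypothetical placeholders, not the paper's values.

```python
import numpy as np

rng = np.random.default_rng(0)

def update_selection(selection, losses, fail_count,
                     loss_low=1e-4, loss_high=1.0, max_fails=50,
                     decay=0.9, boost=1.1):
    """Well-fitted points become less likely to be drawn, poorly fitted
    points more likely; redundant (low-loss) and consistently
    unlearnable points are dropped from the pool."""
    selection = np.where(losses < np.median(losses),
                         selection * decay, selection * boost)
    fail_count = np.where(losses > loss_high, fail_count + 1, 0)
    keep = (losses > loss_low) & (fail_count < max_fails)
    return selection * keep, fail_count

def sample_batch(selection, batch_size):
    """Draw a training batch with probability proportional to the
    selection factors (excluded points have zero probability)."""
    p = selection / selection.sum()
    return rng.choice(len(selection), size=batch_size, replace=False, p=p)
```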
The paper acknowledges that further refinement is needed for scenarios involving the addition of new chemical systems. Although the lifelong MLP (lMLP) can expand its conformation space efficiently, its accuracy still falls slightly below that of training on a single large dataset. Additionally, applying the method to other MLP architectures and ensuring consistency across different electronic states and computational methods remain areas for future work. The authors also suggest that larger and more diverse datasets will be necessary to fully realize the potential of lMLPs in simulating complex chemical systems.
This work is licensed under a Creative Commons Attribution 4.0 International License.