Prediction of Protein Thermodynamic Stability Based on Artificial Intelligence
DOI: 10.16476/j.pibb.2024.0530
- VernacularTitle:基于人工智能的蛋白质热力学稳定性预测
- Author:Lin-Jie TAO 1; Fan-Ding XU 1; Yu GUO 2; Jian-Gang LONG 1; Zhuo-Yang LU 1
Author Information
1. Institute of Mitochondrial Biomedicine, School of Life Sciences and Technology, Xi’an Jiaotong University, Xi’an 710049, China
2. National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an 710049, China
- Publication Type:Journal Article
- Keywords:machine learning; protein thermodynamic stability; mutations
- From:Progress in Biochemistry and Biophysics 2025;52(8):1972-1985
- Country:China
- Language:Chinese
- Abstract:
In recent years, the application of artificial intelligence (AI) in biology has advanced remarkably. The most notable achievements have come in protein structure prediction and design, recognized by the 2024 Nobel Prize in Chemistry for work on AlphaFold and computational protein design. These breakthroughs have transformed our ability to understand protein folding and molecular interactions, marking a milestone in computational biology. Looking ahead, accurate prediction of the physicochemical properties of proteins, beyond static structure, is likely to become the next critical frontier in this rapidly evolving field. One of the most important of these properties is thermodynamic stability, which refers to a protein’s ability to maintain its native conformation under physiological or stress conditions. Accurate prediction of protein stability, especially of the changes caused by single-point mutations, plays a vital role in many scientific and industrial domains, including understanding the molecular basis of disease, rational drug design, development of therapeutic proteins, design of more robust industrial enzymes, and engineering of biosensors. The ability to reliably forecast mutation-induced stability changes therefore has broad implications across biomedical and biotechnological applications. Historically, protein stability was assessed with experimental methods such as differential scanning calorimetry (DSC) and circular dichroism (CD), which are precise but time-consuming and resource-intensive. This prompted the development of computational approaches, including empirical energy functions and physics-based simulations. However, these traditional models often fail to capture the complex, high-dimensional nature of protein conformational landscapes and mutational effects.
Recent advances in machine learning (ML) have significantly improved predictive performance in this area. Early ML models used handcrafted features derived from sequence and structure, whereas modern deep learning models leverage massive datasets and learn representations directly from data. Deep neural networks (DNNs), graph neural networks (GNNs), and attention-based architectures such as transformers have shown particular promise. GNNs excel at modeling spatial and topological relationships in molecular structures, which makes them well suited to protein modeling tasks. Attention mechanisms, in turn, let models dynamically weigh the contribution of specific residues or regions, capturing long-range interactions and allosteric effects. Nevertheless, several key challenges remain. High-quality experimental datasets are scarce and imbalanced, particularly for rare or functionally significant mutations, which can lead to biased or overfitted models. In addition, the inherently dynamic nature of proteins (their conformational flexibility and context-dependent behavior) is difficult to encode in static structural representations. Current models often rely on a single structure or an average conformation, which may overlook important aspects of stability modulation. Efforts are ongoing to incorporate multi-conformational ensembles, molecular dynamics simulations, and physics-informed learning frameworks into predictive models.
This paper reviews the evolution of protein thermodynamic stability prediction techniques, with emphasis on the recent progress enabled by machine learning. It highlights representative datasets, modeling strategies, evaluation benchmarks, and the integration of structural and biochemical features. The aim is to provide researchers with a structured and up-to-date reference that guides the development of more robust, generalizable, and interpretable models for predicting protein stability changes upon mutation. As the field moves forward, the synergy between data-driven AI methods and domain-specific biological knowledge will be key to deeper understanding and broader applications of protein engineering.