Predicting cardiotoxicity in drug development: A deep learning approach
10.1016/j.jpha.2025.101263
- Author:
Kaifeng LIU (1); Huizi CUI (1); Xiangyu YU (1); Wannan LI (1); Weiwei HAN (1)
Author Information
1. Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, Edmond H. Fischer Signal Transduction Laboratory, School of Life Sciences, Jilin University, Changchun, 130012, China
- Publication Type: Journal Article
- Keywords:
Cardiotoxicity;
Human ether-à-go-go related gene channel;
Deep learning;
Molecular fingerprint;
Drug development
- From:
Journal of Pharmaceutical Analysis
2025;15(8):1774-1786
- Country: China
- Language: English
- Abstract:
Cardiotoxicity is a critical issue in drug development that poses serious health risks, including potentially fatal arrhythmias. The human ether-à-go-go related gene (hERG) potassium channel, as one of the primary targets of cardiotoxicity, has garnered widespread attention. Traditional cardiotoxicity testing methods are expensive and time-consuming, making computational virtual screening a suitable alternative. In this study, we employed machine learning techniques utilizing molecular fingerprints and descriptors to predict the cardiotoxicity of compounds, with the aim of improving prediction accuracy and efficiency. We used four types of molecular fingerprints and descriptors combined with machine learning and deep learning algorithms, including Gaussian naive Bayes (NB), random forest (RF), support vector machine (SVM), K-nearest neighbors (KNN), eXtreme gradient boosting (XGBoost), and Transformer models, to build predictive models. Our models demonstrated advanced predictive performance. The best machine learning model, XGBoost_Morgan, achieved an accuracy (ACC) value of 0.84, and the deep learning model, Transformer_Morgan, achieved the best ACC value of 0.85, showing a high ability to distinguish between toxic and non-toxic compounds. On an external independent validation set, it achieved the best area under the curve (AUC) value of 0.93, surpassing ADMETlab3.0, Cardpred, and CardioDPi. In addition, we explored the integration of molecular descriptors and fingerprints to enhance model performance and found that ensemble methods, such as voting and stacking, provided slight improvements in model stability. Furthermore, the SHapley Additive exPlanations (SHAP) analysis revealed the relationship between benzene rings, fluorine-containing groups, NH groups, oxygen in ether groups, and cardiotoxicity, highlighting the importance of these features. This study not only improved the predictive accuracy of cardiotoxicity models but also promoted a more reliable and scientifically interpretable method for drug safety assessment. Using computational methods, this study facilitates a more efficient drug development process, reduces costs, and improves the safety of new drug candidates, ultimately benefiting medical and public health.
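
For readers unfamiliar with the metrics and the voting ensemble mentioned in the abstract, the following is a minimal illustrative sketch, not the authors' implementation. It assumes binary labels (1 = hERG blocker, 0 = non-blocker) and per-compound predicted probabilities; the function names (`accuracy`, `roc_auc`, `soft_vote`), the 0.5 decision threshold, and the toy numbers are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_prob: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of compounds whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(int(p == t) for p, t in zip(preds, y_true))
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random
    negative (ties count as one half)."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def soft_vote(prob_lists: Sequence[Sequence[float]]) -> List[float]:
    """Soft voting: average the predicted probabilities of several models."""
    n_models = len(prob_lists)
    return [sum(column) / n_models for column in zip(*prob_lists)]


# Toy data (hypothetical, not from the study).
y_true = [1, 0, 1, 1, 0, 0]
model_a = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]   # e.g. one fingerprint-based model
model_b = [0.7, 0.4, 0.7, 0.6, 0.3, 0.2]   # e.g. a second model
ensemble = soft_vote([model_a, model_b])

print("model_a  ACC:", accuracy(y_true, model_a), "AUC:", roc_auc(y_true, model_a))
print("ensemble ACC:", accuracy(y_true, ensemble), "AUC:", roc_auc(y_true, ensemble))
```

In this toy case the averaged probabilities correct the two mistakes made by `model_a` alone. That is the general intuition behind voting ensembles, although the abstract reports only slight gains in stability on the real data.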