Classification of cold and hot medicinal properties of Chinese herbal medicines based on graph convolutional network
10.1016/j.dcmed.2025.01.008
- VernacularTitle:基于图卷积网络的中药寒热属性分类研究
- Author:
Mengling YANG 1;
Wei LIU 1
Author Information
1. School of Information Science and Engineering, Hunan University of Chinese Medicine, Changsha, Hunan 410208, China
- Publication Type:Journal Article
- Keywords:
Chinese herbal medicine;
Cold and hot medicinal properties;
Molecular descriptor;
Graph convolutional network(GCN);
Medicinal property classification
- From:
Digital Chinese Medicine
2024;7(4):356-364
- Country:China
- Language:Chinese
- Abstract:
Objective To develop a model based on a graph convolutional network (GCN) to achieve efficient classification of the cold and hot medicinal properties of Chinese herbal medicines (CHMs).

Methods After screening the dataset provided in the published literature, this study included 495 CHMs and their 8,075 compounds. Three molecular descriptors were used to represent the compounds: the molecular access system (MACCS), extended connectivity fingerprint (ECFP), and two-dimensional (2D) molecular descriptors computed by the RDKit open-source toolkit (RDKit_2D). A homogeneous graph with CHMs as nodes was constructed, and a classification model for the cold and hot medicinal properties of CHMs was developed based on a GCN, using the molecular descriptor information of the compounds as node features. Finally, with accuracy and F1 score as the evaluation metrics, the GCN model was experimentally compared with traditional machine learning approaches, including decision tree (DT), random forest (RF), k-nearest neighbor (KNN), Naïve Bayes classifier (NBC), and support vector machine (SVM). MACCS, ECFP, and RDKit_2D molecular descriptors were also adopted as features for comparison.

Results The experimental results show that the GCN achieved better performance than the traditional machine learning approaches when using MACCS as features, with the accuracy and F1 score reaching 0.8364 and 0.8453, respectively. The accuracy and F1 score increased by 0.8690 and 0.8120, respectively, compared with the lowest-performing feature combination OMER (only the combination of MACCS, ECFP, and RDKit_2D). The accuracy and F1 score of DT, RF, KNN, NBC, and SVM were 0.5051 and 0.5018, 0.6162 and 0.6015, 0.6768 and 0.6243, 0.6162 and 0.6071, and 0.6364 and 0.6225, respectively.

Conclusion In this study, by introducing molecular descriptors as features, it is verified that molecular descriptors and fingerprints play a key role in classifying the cold and hot medicinal properties of CHMs. Meanwhile, excellent classification performance was achieved using the GCN model, providing an important algorithmic basis for the in-depth study of the "structure-property" relationship of CHMs.
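
Illustrative sketch (not the authors' code): the abstract names a GCN over a graph of CHM nodes and evaluation by accuracy and F1 score, but gives no implementation details. The pure-Python example below shows one standard GCN propagation step, ReLU(D^{-1/2}(A + I)D^{-1/2} X W), and a binary accuracy/F1 computation. The function names, the toy adjacency matrix, the two-dimensional node features, and the weights are all hypothetical; how the study defines edges between CHMs, pools compound descriptors into node features, or averages F1 is not stated in the abstract and is assumed here only for illustration.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features during propagation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj_norm, features, weights):
    """One GCN propagation step: ReLU(adj_norm @ features @ weights)."""
    out = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and binary F1 score, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1


# Toy graph (hypothetical): herbs 0 and 1 are linked, herb 2 is isolated.
toy_adj = [[0, 1, 0],
           [1, 0, 0],
           [0, 0, 0]]
# Toy node features standing in for pooled compound descriptors (assumption).
toy_features = [[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]]
toy_weights = [[1.0], [2.0]]  # hypothetical weights, not trained values

toy_hidden = gcn_layer(normalize_adjacency(toy_adj), toy_features, toy_weights)
toy_accuracy, toy_f1 = accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

In this toy graph the two linked herbs end up with identical hidden values because the normalized adjacency averages their features, which is the neighborhood-smoothing behavior that distinguishes a GCN from the per-sample classifiers (DT, RF, KNN, NBC, SVM) listed in the abstract.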