Integrating biogravimetric analysis and machine learning for systematic studies of botanical materials: From bioactive constituent identification to production area prediction.
10.1016/j.jpha.2025.101222
- Author:
Sinan WANG 1;
Huiru XIANG 1;
Xinyuan PAN 1;
Jianyang PAN 1;
Lu ZHAO 1;
Yi WANG 1;
Shaoqing CUI 2;
Yu TANG 1
Author Information
1. Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
2. Department of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou, 310058, China.
- Publication Type:Journal Article
- Keywords:
Activity index;
Bioactive substance groups;
Biogravimetric analysis;
Codonopsis radix;
qHNMR
- From:
Journal of Pharmaceutical Analysis
2025;15(10):101222-101222
- Country:China
- Language:English
- Abstract:
Bioassay-guided fractionation and isolation of bioactive constituents from botanical materials frequently ends with a single compound as the reward. However, botanical materials typically exert their therapeutic actions through multi-pathway effects owing to the intrinsic complexity of their chemical constituents. In addition, the content of bioactive compounds in botanical materials depends largely on humidity, temperature, soil, and especially geographical origin, which makes rapid and accurate identification of plant materials a pressing need. These long-standing obstacles collectively impede the in-depth exploitation and application of these versatile natural resources. To address these challenges, a new paradigm integrating biogravimetric analyses and machine learning-driven origin classification (BAMLOC) was developed. The biogravimetric analyses are based on absolute qHNMR quantification and activity index calculation assisted by an in vivo zebrafish model, by which the bioactive substance groups jointly responsible for the bioactivities in all fractions are pinpointed before any isolation effort. To differentiate botanical materials of different origins that vary in their content of bioactive substance groups, principal component analysis, linear discriminant analysis, and hierarchical cluster analysis are employed in conjunction with a supervised support vector machine to classify and predict production areas based on the detection of volatile organic compounds by E-nose and GC-MS. Applying BAMLOC to Codonopsis Radix enables the identification of polyacetylenes and pyrrolidine alkaloids as the bioactive substance group responsible for the immune restoration effect, as well as the accurate determination of plant origins. This study advances the toolbox for the discovery of bioactive compounds from complex mixtures and lays a more definitive foundation for the in-depth utilization of botanical materials.
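The origin-classification step described in the abstract (dimensionality reduction followed by a supervised support vector machine) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no data, preprocessing, or hyperparameters, so everything below is assumed for illustration, including the synthetic "sensor" matrix, the number of production areas, the number of principal components, and the RBF kernel. It uses scikit-learn and NumPy.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical setup (not from the paper): 3 production areas,
# 40 samples per area, 10 sensor channels per sample.
rng = np.random.default_rng(0)
n_areas, n_per_area, n_sensors = 3, 40, 10

# Synthetic stand-in for E-nose responses: each area gets its own
# mean response profile, with unit-variance noise around it.
area_means = rng.normal(0.0, 3.0, size=(n_areas, n_sensors))
X = np.vstack([
    rng.normal(area_means[k], 1.0, size=(n_per_area, n_sensors))
    for k in range(n_areas)
])
y = np.repeat(np.arange(n_areas), n_per_area)  # area label per sample

# Hold out a stratified test set so every area is represented.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize channels, project onto 3 principal components (assumed),
# then classify with an RBF-kernel SVM (assumed kernel and C).
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=3),
    SVC(kernel="rbf", C=1.0),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Wrapping the scaler and PCA inside the pipeline means both are fitted on the training split only, which avoids leaking information from the held-out samples into the model. With real E-nose or GC-MS data, the feature matrix `X` and labels `y` would be replaced by measured responses and known production areas.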