1.Hub biomarkers and their clinical relevance in glycometabolic disorders: A comprehensive bioinformatics and machine learning approach.
Liping XIANG ; Bing ZHOU ; Yunchen LUO ; Hanqi BI ; Yan LU ; Jian ZHOU
Chinese Medical Journal 2025;138(16):2016-2027
BACKGROUND:
Gluconeogenesis is a critical metabolic pathway for maintaining glucose homeostasis, and its dysregulation can lead to glycometabolic disorders. This study aimed to identify hub biomarkers of these disorders to provide a theoretical foundation for enhancing diagnosis and treatment.
METHODS:
Gene expression profiles from liver tissues of three well-characterized gluconeogenesis mouse models were analyzed to identify commonly differentially expressed genes (DEGs). Weighted gene co-expression network analysis (WGCNA), machine learning techniques, and diagnostic tests on transcriptome data from publicly available datasets of type 2 diabetes mellitus (T2DM) patients were employed to assess the clinical relevance of these DEGs. Subsequently, we identified hub biomarkers associated with gluconeogenesis-related glycometabolic disorders, investigated potential correlations with immune cell types, and validated expression using quantitative polymerase chain reaction in the mouse models.
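The first step of the Methods above, finding DEGs shared by all three mouse models, amounts to a set intersection. The following is a minimal sketch, assuming each model's DEG list is already available as gene symbols. The model names and gene symbols are placeholders invented for illustration, not data from the study, and `common_degs` is a hypothetical helper.

```python
# Hypothetical DEG lists for three mouse models (placeholder symbols, not study data).
degs_by_model = {
    "model_a": {"Gene1", "Gene2", "Gene3", "Gene4"},
    "model_b": {"Gene2", "Gene3", "Gene5"},
    "model_c": {"Gene2", "Gene3", "Gene6", "Gene7"},
}


def common_degs(deg_sets):
    """Return the genes present in every DEG set, sorted for stable output."""
    sets = [set(s) for s in deg_sets]
    if not sets:
        return []
    shared = sets[0]
    for s in sets[1:]:
        shared = shared & s  # keep only genes seen in all sets so far
    return sorted(shared)


common = common_degs(degs_by_model.values())
print(common)
```

A strict intersection like this shrinks quickly as models are added, which is one plausible reason why only a few common DEGs would survive across heterogeneous models.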
RESULTS:
Only a few common DEGs were observed in gluconeogenesis-related glycometabolic disorders across different contributing factors. However, these DEGs were consistently associated with cytokine regulation and oxidative stress (OS). Enrichment analysis highlighted significant alterations in terms related to cytokines and OS. Importantly, osteomodulin (OMD), apolipoprotein A4 (APOA4), and insulin like growth factor binding protein 6 (IGFBP6) were identified with potential clinical significance in T2DM patients. These genes demonstrated robust diagnostic performance in T2DM cohorts and were positively correlated with resting dendritic cells.
CONCLUSIONS:
Gluconeogenesis-related glycometabolic disorders exhibit considerable heterogeneity, yet changes in cytokine regulation and OS are universally present. OMD, APOA4, and IGFBP6 may serve as hub biomarkers for gluconeogenesis-related glycometabolic disorders.
Machine Learning ; Humans ; Computational Biology/methods* ; Biomarkers/metabolism* ; Diabetes Mellitus, Type 2/genetics* ; Animals ; Mice ; Gluconeogenesis/physiology* ; Gene Expression Profiling ; Transcriptome/genetics* ; Gene Regulatory Networks/genetics* ; Clinical Relevance
2.Computational pathology in precision oncology: Evolution from task-specific models to foundation models.
Yuhao WANG ; Yunjie GU ; Xueyuan ZHANG ; Baizhi WANG ; Rundong WANG ; Xiaolong LI ; Yudong LIU ; Fengmei QU ; Fei REN ; Rui YAN ; S Kevin ZHOU
Chinese Medical Journal 2025;138(22):2868-2878
With the rapid development of artificial intelligence, computational pathology has been seamlessly integrated into the entire clinical workflow, which encompasses diagnosis, treatment, prognosis, and biomarker discovery. This integration has significantly enhanced clinical accuracy and efficiency while reducing the workload for clinicians. Traditionally, research in this field has depended on the collection and labeling of large datasets for specific tasks, followed by the development of task-specific computational pathology models. However, this approach is labor intensive and does not scale efficiently for open-set identification or rare diseases. Given the diversity of clinical tasks, training individual models from scratch to address the whole spectrum of clinical tasks in the pathology workflow is impractical, which highlights the urgent need to transition from task-specific models to foundation models (FMs). In recent years, pathological FMs have proliferated. These FMs can be classified into three categories, namely, pathology image FMs, pathology image-text FMs, and pathology image-gene FMs, each of which results in distinct functionalities and application scenarios. This review provides an overview of the latest research advancements in pathological FMs, with a particular emphasis on their applications in oncology. The key challenges and opportunities presented by pathological FMs in precision oncology are also explored.
Humans ; Precision Medicine/methods* ; Medical Oncology/methods* ; Artificial Intelligence ; Neoplasms/pathology* ; Computational Biology/methods*
3.Preliminary exploration of multi-omics data fusion methods for high-dimensional small-sample datasets in traditional Chinese medicine.
Nian WANG ; Cheng-Cheng YU ; Hu YANG ; Zhong WANG ; Jun LIU
China Journal of Chinese Materia Medica 2025;50(1):278-284
With the advancement in big data and artificial intelligence technologies, the extensive application of omics technologies in traditional Chinese medicine (TCM) research has generated large experimental datasets, enabling the exploration of cross-scale correlations among massive data and thereby resulting in the shift toward a data-intensive research paradigm. The emerging approach of multi-omics data fusion analysis, emphasizing technical and computational tools, presents a potential breakthrough in this field. The holistic perspective of TCM aligns with the concept of multi-omics data fusion, yet the data types encountered exhibit high dimensionality with small sample sizes, necessitating data processing techniques such as dimensionality reduction. The current challenge lies in selecting suitable analytical methods for these data to enhance the systematic understanding of physiological functions and disease diagnosis/treatment processes. This paper explores the theories and frameworks of multi-omics data fusion, analyzes methods for fusing high-dimensional, small-sample multi-omics data in TCM, and aims to provide insights for advancing TCM research.
Medicine, Chinese Traditional/methods* ; Humans ; Computational Biology/methods* ; Genomics/methods* ; Sample Size ; Artificial Intelligence ; Multiomics
4.Prediction of immunotherapy targets for chronic cerebral hypoperfusion by bioinformatics method.
Mei ZHAO ; Yanpeng XUE ; Qingqing TIAN ; He YANG ; Qing JIANG ; Mengfan YU ; Xin CHEN
Journal of Biomedical Engineering 2025;42(2):382-388
Chronic cerebral hypoperfusion (CCH) plays an important role in the occurrence and development of vascular dementia (VD). Recent studies have indicated that multiple stages of immune-inflammatory response are involved in the process of cerebral ischemia, drawing increasing attention to immune therapies for cerebral ischemia. This study aims to identify potential immune therapeutic targets for CCH using bioinformatics methods from an immunological perspective. We identified a total of 823 differentially expressed genes associated with CCH, and further screened for 9 core immune-related genes, namely RASGRP1, FGF12, SEMA7A, PAK6, EDN3, BPHL, FCGRT, HSPA1B and MLNR. Gene enrichment analysis showed that core genes were mainly involved in biological functions such as cell growth, neural projection extension, and mesenchymal stem cell migration. Biological signaling pathway analysis indicated that core genes were mainly involved in the regulation of T cell receptor, Ras and MAPK signaling pathways. Through LASSO regression, we identified RASGRP1 and BPHL as key immune-related core genes. Additionally, by integrating differential miRNAs and the miRWalk database, we identified miR-216b-5p as a key immune-related miRNA that regulates RASGRP1. In summary, the predicted miR-216b-5p/RASGRP1 signaling pathway plays a significant role in immune regulation during CCH, which may provide new targets for immune therapy in CCH.
Humans ; Computational Biology/methods* ; Brain Ischemia/therapy* ; Immunotherapy ; MicroRNAs/genetics* ; Signal Transduction ; Dementia, Vascular/genetics* ; Chronic Disease
5.Unveiling the molecular features and diagnosis and treatment prospects of immunothrombosis via integrated bioinformatics analysis.
Yafen WANG ; Xiaoshuang WU ; Zhixin LIU ; Xinlei LI ; Yaozhen CHEN ; Ning AN ; Xingbin HU
Chinese Journal of Cellular and Molecular Immunology 2025;41(3):228-235
Objective To investigate the common molecular features of immunothrombosis, thus enhancing the comprehension of thrombosis triggered by immune and inflammatory responses and offering crucial insights for identifying potential diagnostic and therapeutic targets. Methods Differential gene expression analysis and functional enrichment analysis were conducted on datasets of systemic lupus erythematosus (SLE) and venous thromboembolism (VTE). The intersection of differentially expressed genes in SLE and VTE with those of neutrophil extracellular traps (NET) yielded cross-talk genes (CG) for SLE-NET and VTE-NET interaction. Further analysis included functional enrichment and protein-protein interaction (PPI) network assessments of these CG to identify hub genes. Venn diagrams and receiver operating characteristic (ROC) curve analysis were employed to pinpoint the most effective shared diagnostic CG, which were validated using a graft-versus-host disease (GVHD) dataset. Results Differentially expressed genes in SLE and VTE were associated with distinct biological processes, whereas SLE-NET-CG and VTE-NET-CG were implicated in pathways related to leukocyte migration, inflammatory response, and immune response. Through PPI network analysis, several hub genes were identified, with matrix metalloproteinase 9 (MMP9) and S100 calcium-binding protein A12 (S100A12) emerging as the best shared diagnostic CG for SLE (AUC: 0.936 and 0.832) and VTE (AUC: 0.719 and 0.759). Notably, MMP9 exhibited good diagnostic performance in the GVHD dataset (AUC: 0.696). Conclusion This study unveils the common molecular features of SLE, VTE, and NET, emphasizing MMP9 and S100A12 as the optimal shared diagnostic CG, thus providing valuable evidence for the diagnosis and therapeutic strategies related to immunothrombosis. Additionally, the expression of MMP9 in GVHD highlights its critical role in the risk of VTE associated with immune system disorders.
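The AUC values reported above summarize how well a single gene's expression separates cases from controls. The following is a minimal sketch of that calculation using the rank-based (Mann-Whitney) formulation, assuming higher expression indicates disease. The function name `roc_auc` and the toy expression values are illustrative assumptions, not the authors' code or data.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random case scores higher than a random
    control, counting ties as one half (Mann-Whitney formulation)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy expression values for one hypothetical gene (not study data).
expression = [2.1, 3.4, 3.9, 1.2, 2.0, 2.5]
is_case = [0, 1, 1, 0, 1, 0]

auc = roc_auc(expression, is_case)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level separation, so values such as 0.696 sit closer to chance than values such as 0.936. For a marker that is downregulated in disease, this sketch would return a value below 0.5 and the scores would need to be negated first.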
Humans ; Computational Biology/methods* ; Lupus Erythematosus, Systemic/immunology* ; Protein Interaction Maps/genetics* ; Venous Thromboembolism/therapy* ; Matrix Metalloproteinase 9/genetics* ; Extracellular Traps/metabolism* ; Gene Regulatory Networks ; Thrombosis/immunology* ; Graft vs Host Disease/genetics* ; Gene Expression Profiling
6.Single-cell transcriptomics combined with bioinformatics for comprehensive analysis of macrophage subpopulations and hub genes in ischemic stroke.
Jingyao XU ; Xiaolu WANG ; Shuai HOU ; Meng PANG ; Gang WANG ; Yanqiang WANG
Chinese Journal of Cellular and Molecular Immunology 2025;41(6):505-513
Objective To explore macrophage subpopulations in ischemic stroke (IS) by using single-cell RNA sequencing (scRNA-seq) data analysis and High-Dimensional Weighted Gene Co-Expression Network Analysis (hdWGCNA). Methods Based on single-cell sequencing data, transcriptomic information for different cell types was obtained, and macrophages were selected for subpopulation identification. hdWGCNA, cell-cell communication, and pseudotime trajectory analysis were used to explore the characteristics of macrophage subpopulations following IS. Key genes related to IS were identified using microarray data and validated for diagnostic potential through Receiver Operating Characteristic (ROC) analysis. Gene Set Enrichment Analysis (GSEA) was conducted to investigate the potential functions of these genes. Results The scRNA-seq data analysis revealed significant changes in macrophage subpopulation composition after IS. A specific macrophage subpopulation enriched in the stroke group was identified and designated as MCAO-specific macrophages (MSM). Pseudotime trajectory analysis indicated that MSM cells were in an intermediate stage of macrophage differentiation. Cell-cell communication analysis uncovered complex interactions between MSM cells and other cells, with the CCL6-CCR1 signaling axis potentially playing a crucial role in neuroinflammation. Two gene modules associated with MSM were identified via hdWGCNA, significantly enriched in pathways related to NOD-like receptors and antigen processing. By integrating differentially expressed MSM genes with conventional transcriptomic data, three IS-related hub genes were identified: Arg1, CLEC4D, and CLEC4E. Conclusion This study reveals the characteristics and functions of macrophage subpopulations following IS and identifies three hub genes with potential diagnostic value, providing novel insights into the pathological mechanisms of IS.
Macrophages/metabolism* ; Computational Biology/methods* ; Single-Cell Analysis/methods* ; Transcriptome ; Ischemic Stroke/metabolism* ; Animals ; Gene Regulatory Networks ; Gene Expression Profiling ; Humans ; Male
7.Identification of prognosis-related key genes in hepatocellular carcinoma based on bioinformatics analysis.
Qian XIE ; Yingshan ZHU ; Ge HUANG ; Yue ZHAO
Journal of Central South University(Medical Sciences) 2025;50(2):167-180
OBJECTIVES:
Hepatocellular carcinoma is one of the most common primary malignant tumors with the third highest mortality rate worldwide. This study aims to identify key genes associated with hepatocellular carcinoma prognosis using the Gene Expression Omnibus (GEO) database and provide a theoretical basis for discovering novel prognostic biomarkers for hepatocellular carcinoma.
METHODS:
Hepatocellular carcinoma-related datasets were retrieved from the GEO database. Differentially expressed genes (DEGs) were identified using the GEO2R tool. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). A protein-protein interaction (PPI) network was constructed using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING), and key genes were identified using Cytoscape software. The University of Alabama at Birmingham Cancer Data Analysis Resource (UALCAN) was used to analyze the expression levels of key genes in normal and hepatocellular carcinoma tissues, as well as their associations with pathological grade, clinical stage, and patient survival. The Human Protein Atlas (THPA) was used to further validate the impact of key genes on overall survival. Expression levels of key genes in the blood of hepatocellular carcinoma patients were evaluated using the expression atlas of blood-based biomarkers in the early diagnosis of cancers (BBCancer).
RESULTS:
A total of 78 DEGs were identified from the GEO database. GO and KEGG analyses indicated that these genes may contribute to hepatocellular carcinoma progression by promoting cell division and regulating protein kinase activity. Sixteen key genes were screened via Cytoscape and validated using UALCAN and THPA. These genes were overexpressed in hepatocellular carcinoma tissues and were associated with disease progression and poor prognosis. Finally, BBCancer analysis showed that ASPM and NCAPG were also elevated in the blood of hepatocellular carcinoma patients.
CONCLUSIONS:
This study identified 16 key genes as potential prognostic biomarkers for hepatocellular carcinoma, among which ASPM and NCAPG may serve as promising blood-based markers for hepatocellular carcinoma.
Humans ; Carcinoma, Hepatocellular/mortality* ; Liver Neoplasms/pathology* ; Prognosis ; Computational Biology/methods* ; Protein Interaction Maps/genetics* ; Biomarkers, Tumor/genetics* ; Gene Expression Regulation, Neoplastic ; Gene Expression Profiling ; Gene Ontology ; Databases, Genetic
8.Identification of shared key genes and pathways in osteoarthritis and sarcopenia patients based on bioinformatics analysis.
Yuyan SUN ; Ziyu LUO ; Huixian LING ; Sha WU ; Hongwei SHEN ; Yuanyuan FU ; Thainamanh NGO ; Wen WANG ; Ying KONG
Journal of Central South University(Medical Sciences) 2025;50(3):430-446
OBJECTIVES:
Osteoarthritis (OA) and sarcopenia are significant health concerns in the elderly, substantially impacting their daily activities and quality of life. However, the relationship between them remains poorly understood. This study aims to uncover common biomarkers and pathways associated with both OA and sarcopenia.
METHODS:
Gene expression profiles related to OA and sarcopenia were retrieved from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) between disease and control groups were identified using R software. Common DEGs were extracted via Venn diagram analysis. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were conducted to identify biological processes and pathways associated with shared DEGs. Protein-protein interaction (PPI) networks were constructed, and candidate hub genes were ranked using the maximal clique centrality (MCC) algorithm. Further validation of hub gene expression was performed using 2 independent datasets. Receiver operating characteristic (ROC) curve analysis was used to evaluate the predictive value of key genes for OA and sarcopenia. Mouse models of OA and sarcopenia were established. Hematoxylin-eosin and Safranin O/Fast Green staining were used to validate the OA model. The sarcopenia model was validated via rotarod testing and quadriceps muscle mass measurement. Real-time reverse transcription PCR (real-time RT-PCR) was employed to assess the mRNA expression levels of candidate key genes in both models. Gene set enrichment analysis (GSEA) was conducted to identify pathways associated with the selected shared key genes in both diseases.
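The hub-gene ranking step above relies on maximal clique centrality (MCC). The following is a minimal sketch under a stated assumption: a node's MCC score is taken to be the sum of (|C| - 1)! over every maximal clique C that contains it, which is the definition commonly attributed to the cytoHubba plugin. The function names and the toy network are illustrative, not the authors' pipeline or data.

```python
from math import factorial


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)  # r cannot be extended, so it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def mcc_scores(edges):
    """Assumed MCC definition: sum of (clique size - 1)! over maximal cliques."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    scores = {node: 0 for node in adj}
    for clique in maximal_cliques(adj):
        weight = factorial(len(clique) - 1)
        for node in clique:
            scores[node] += weight
    return scores


# Toy PPI network with placeholder node names (not study data).
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
scores = mcc_scores(toy_edges)
ranked = sorted(scores, key=lambda n: (-scores[n], n))
print(ranked, scores)
```

Under this definition a node whose neighbors are not connected to each other scores exactly its degree, while membership in large cliques is rewarded factorially. The basic recursion without pivoting is adequate for small networks such as an 89-gene PPI graph but scales poorly to very large ones.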
RESULTS:
A total of 89 common DEGs were identified in the gene expression profiles of OA and sarcopenia, including 76 upregulated and 13 downregulated genes. These 89 DEGs were significantly enriched in protein digestion and absorption, the PI3K-Akt signaling pathway, and extracellular matrix-receptor interaction. PPI network analysis and MCC algorithm analysis of the 89 common DEGs identified the top 17 candidate hub genes. Based on the differential expression analysis of these 17 candidate hub genes in the validation datasets, AEBP1 and COL8A2 were ultimately selected as the common key genes for both diseases, both of which showed a significant upregulation trend in the disease groups (all P<0.05). The area under the curve (AUC) values for AEBP1 and COL8A2 in the OA and sarcopenia datasets were all greater than 0.7, indicating that both genes have potential value in predicting OA and sarcopenia. Real-time RT-PCR results showed that the mRNA expression levels of AEBP1 and COL8A2 were significantly upregulated in the disease groups (all P<0.05), consistent with the results observed in the bioinformatics analysis. GSEA revealed that AEBP1 and COL8A2 were closely related to extracellular matrix-receptor interaction, ribosome, and oxidative phosphorylation in OA and sarcopenia.
CONCLUSIONS:
AEBP1 and COL8A2 have the potential to serve as common biomarkers for OA and sarcopenia. The extracellular matrix-receptor interaction pathway may represent a potential target for the prevention and treatment of both OA and sarcopenia.
Sarcopenia/genetics* ; Osteoarthritis/genetics* ; Computational Biology/methods* ; Humans ; Protein Interaction Maps/genetics* ; Animals ; Mice ; Gene Expression Profiling ; Gene Ontology ; Transcriptome ; Male ; Signal Transduction/genetics* ; Gene Regulatory Networks
9.Construction of a treatment response prediction model for multiple myeloma based on multi-omics and machine learning.
Xionghui ZHOU ; Rong GUI ; Jing LIU ; Meng GAO
Journal of Central South University(Medical Sciences) 2025;50(4):531-544
OBJECTIVES:
Multiple myeloma (MM) is a hematologic malignancy characterized by clonal proliferation of plasma cells and remains incurable. Patients with primary refractory multiple myeloma (PRMM) show poor response to initial induction therapy. This study aims to develop a machine learning-based model to predict treatment response in newly diagnosed multiple myeloma (NDMM) patients, in order to optimize therapeutic strategies.
METHODS:
NDMM and post-treatment MM patients hospitalized in the Department of Hematology, Third Xiangya Hospital, Central South University, between August 2022 and July 2023 were enrolled. Post-treatment MM patients were categorized into PRMM patients and treatment-responsive MM (TRMM) patients based on therapeutic efficacy. Serum metabolites were detected and analyzed via metabolomics. Based on the metabolomics analysis results and combined with transcriptomic sequencing data of NDMM patients from databases, differentially expressed amino acid metabolism-related genes (AAMGs) among post-treatment NDMM patients with varying therapeutic outcomes were screened. Using bioinformatics analyses and machine learning algorithms, a predictive model for treatment response in NDMM was constructed and used to identify patients at risk for PRMM.
RESULTS:
A total of 61 patients were included: 22 NDMM, 23 TRMM, and 16 PRMM patients. Significant differences in metabolite levels were observed among the 3 groups, with differential metabolites mainly enriched in amino acid metabolism pathways. Follow-up data were available for 16 of the 22 NDMM patients, including 12 treatment responders (ND_TR group) and 4 with PRMM (ND_PR group). A total of 23 differential metabolites were identified between these 2 groups: 6 metabolites (e.g., tryptophan) were upregulated and 17 (e.g., citric acid) were downregulated in the ND_TR group. Transcriptomic data from 108 TRMM and 77 PRMM patients were analyzed to identify differentially expressed AAMGs, which were then used to construct a prediction model. The area under the receiver operating characteristic curve (AUC) for the model exceeded 0.8, and AUC values in 3 external validation cohorts were all above 0.7.
CONCLUSIONS:
This study delineated the metabolic alterations in MM patients with different treatment responses, suggesting that dysregulated amino acid metabolism may be associated with poor treatment response in PRMM. By integrating metabolomics and transcriptomics, a machine learning-based predictive model was successfully established to forecast treatment response in NDMM patients.
Humans ; Multiple Myeloma/drug therapy* ; Machine Learning ; Male ; Female ; Metabolomics/methods* ; Middle Aged ; Aged ; Treatment Outcome ; Transcriptome ; Computational Biology ; Adult ; Multiomics
10.Databases, knowledge bases, and large models for biomanufacturing.
Zhitao MAO ; Xiaoping LIAO ; Hongwu MA
Chinese Journal of Biotechnology 2025;41(3):901-916
Biomanufacturing is an advanced manufacturing method that integrates biology, chemistry, and engineering. It utilizes renewable biomass and biological organisms as production media to scale up the production of target products through fermentation. Compared with petrochemical routes, biomanufacturing offers significant advantages in reducing CO2 emissions, lowering energy consumption, and cutting costs. With the development of systems biology and synthetic biology and the accumulation of bioinformatics data, the integration of information technologies such as artificial intelligence, large models, and high-performance computing with biotechnology is propelling biomanufacturing into a data-driven era. This paper reviews the latest research progress on databases, knowledge bases, and large language models for biomanufacturing. It explores the development directions, challenges, and emerging technical methods in this field, aiming to provide guidance and inspiration for scientific research in related areas.
Biotechnology/methods* ; Knowledge Bases ; Synthetic Biology ; Databases, Factual ; Artificial Intelligence ; Systems Biology ; Computational Biology ; Fermentation
