1.Serological characteristics and bioinformatics analysis of 4 blood donors with RHCE*cE(281C,282T) variant allele.
Fan WU ; Naibao ZHUANG ; Liyan SUN ; Tong LIU ; Yanlian LIANG ; Shuang LIANG
Chinese Journal of Medical Genetics 2025;42(2):137-144
OBJECTIVE:
To explore the serological characteristics and bioinformatics analysis results of 4 blood donors carrying the RHCE*cE (281C, 282T) variant allele.
METHODS:
A total of 4 unrelated blood donors carrying the RHCE*cE (281C, 282T) variant allele (donors 1-4), who donated blood at Shenzhen Blood Center between January 2022 and June 2023, were selected as study subjects. All 4 donors were Han Chinese, and 5 mL of venous blood was collected from the elbow of each donor. The RhCcEe type was determined by routine serological testing with 4 monoclonal antibody reagents. All 10 exons of the RHCE gene and their adjacent flanking intron regions were analyzed by Sanger sequencing, and full-length haplotype analysis of the RHCE gene was performed with third-generation single-molecule real-time (SMRT) sequencing. DeepTMHMM was used to analyze the transmembrane structure of the wild-type and variant RhCcEe proteins and to predict the location of the amino acid substitution. The effect of the mutation on RhCcEe protein function was analyzed with PolyPhen-2, SIFT and Mutation Taster. Robetta and Swiss-PdbViewer v4.1.0 were used to model the tertiary structures of RhCcEe and to compare the wild-type and variant proteins. The mutation was rated according to the standards and guidelines for the classification of genetic variants of the American College of Medical Genetics and Genomics (ACMG). The study was approved by the Medical Ethics Committee of Shenzhen Blood Center (Approval No. SZBCMEC-2022-024).
RESULTS:
The RhCcEe phenotypes of the 4 blood donors were CCEweake by serological testing. The RhE antigen was weakly expressed, with reaction strengths ranging from 0 to 3+. Sequence analysis of the RHCE gene showed that all 4 donors carried the RHCE*cE (281C, 282T) allele. The mutation causes a single amino acid substitution in the RhCcEe protein (p.Leu94Pro); the substitution is located in the transmembrane α3 chain and resulted in significant changes in the 3D structure of the extracellular region of the RhCcEe protein. The substitution was predicted to be "Probably damaging", "Damaging" and "Polymorphism" by PolyPhen-2, SIFT and Mutation Taster, respectively. According to the ACMG guidelines, the variant was rated as likely pathogenic.
CONCLUSION:
The RHCE*cE (281C, 282T) variant allele was identified for the first time in the Han Chinese population, and the serological data for this allele were enriched, which helps to safeguard blood transfusion safety. The bioinformatics analysis provides evidence for further study of the structure and function of the RhCcEe protein.
Humans; Blood Donors; Computational Biology/methods*; Alleles; Rh-Hr Blood-Group System/genetics*; Male; Female; Adult; Exons
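The allele name in entry 1 (281C, 282T) and the reported p.Leu94Pro substitution are linked by simple codon arithmetic. The Python sketch below maps coding-sequence positions to codon numbers and shows that placing C at the second and T at the third position of a leucine codon of the form CTN yields the proline codon CCT. The reference codon `CTG` is a placeholder assumption (the abstract does not state the reference bases), and the function names are illustrative only.

```python
from typing import Dict, Tuple


def codon_position(cds_pos: int) -> Tuple[int, int]:
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    if cds_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


# Subset of the standard genetic code, sufficient for this example.
CODON_TABLE: Dict[str, str] = {
    "CTT": "Leu", "CTC": "Leu", "CTA": "Leu", "CTG": "Leu",
    "CCT": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
}


def apply_substitutions(codon: str, changes: Dict[int, str]) -> str:
    """Return the codon after replacing bases at the given 1-based codon positions."""
    bases = list(codon)
    for pos_in_codon, base in changes.items():
        bases[pos_in_codon - 1] = base
    return "".join(bases)


# Both variant positions fall in codon 94 (coding positions 280-282).
print(codon_position(281))  # (94, 2)
print(codon_position(282))  # (94, 3)

# Hypothetical reference codon; only its first base (C) matters for the result.
reference_codon = "CTG"
variant_codon = apply_substitutions(reference_codon, {2: "C", 3: "T"})
print(CODON_TABLE[reference_codon], "->", CODON_TABLE[variant_codon])
```

The first base must be C for the result to be proline: a leucine codon starting with T (TTA or TTG) would become a serine codon when its second base changes to C.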
2.Hub biomarkers and their clinical relevance in glycometabolic disorders: A comprehensive bioinformatics and machine learning approach.
Liping XIANG ; Bing ZHOU ; Yunchen LUO ; Hanqi BI ; Yan LU ; Jian ZHOU
Chinese Medical Journal 2025;138(16):2016-2027
BACKGROUND:
Gluconeogenesis is a critical metabolic pathway for maintaining glucose homeostasis, and its dysregulation can lead to glycometabolic disorders. This study aimed to identify hub biomarkers of these disorders to provide a theoretical foundation for enhancing diagnosis and treatment.
METHODS:
Gene expression profiles from liver tissues of three well-characterized gluconeogenesis mouse models were analyzed to identify commonly differentially expressed genes (DEGs). Weighted gene co-expression network analysis (WGCNA), machine learning techniques, and diagnostic tests on transcriptome data from publicly available datasets of type 2 diabetes mellitus (T2DM) patients were employed to assess the clinical relevance of these DEGs. Subsequently, we identified hub biomarkers associated with gluconeogenesis-related glycometabolic disorders, investigated potential correlations with immune cell types, and validated expression using quantitative polymerase chain reaction in the mouse models.
RESULTS:
Only a few common DEGs were observed in gluconeogenesis-related glycometabolic disorders across different contributing factors. However, these DEGs were consistently associated with cytokine regulation and oxidative stress (OS). Enrichment analysis highlighted significant alterations in terms related to cytokines and OS. Importantly, osteomodulin (OMD), apolipoprotein A4 (APOA4), and insulin-like growth factor binding protein 6 (IGFBP6) were identified with potential clinical significance in T2DM patients. These genes demonstrated robust diagnostic performance in T2DM cohorts and were positively correlated with resting dendritic cells.
CONCLUSIONS:
Gluconeogenesis-related glycometabolic disorders exhibit considerable heterogeneity, yet changes in cytokine regulation and OS are universally present. OMD, APOA4, and IGFBP6 may serve as hub biomarkers for gluconeogenesis-related glycometabolic disorders.
Machine Learning; Humans; Computational Biology/methods*; Biomarkers/metabolism*; Diabetes Mellitus, Type 2/genetics*; Animals; Mice; Gluconeogenesis/physiology*; Gene Expression Profiling; Transcriptome/genetics*; Gene Regulatory Networks/genetics*; Clinical Relevance
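Entry 2 describes finding DEGs common to three mouse models. One plausible way to do that is to filter each model's results by significance and intersect the resulting gene sets, as sketched below in Python. The gene names, the values, and the cutoffs (|log2 fold change| >= 1, adjusted p < 0.05) are toy assumptions, not taken from the study, and the sketch ignores the direction of change.

```python
from typing import Dict, Set, Tuple

# Per-model results: gene -> (log2 fold change, adjusted p-value).
ModelResults = Dict[str, Tuple[float, float]]


def significant_genes(results: ModelResults,
                      lfc_cutoff: float = 1.0,
                      padj_cutoff: float = 0.05) -> Set[str]:
    """Genes passing both the fold-change and the adjusted p-value cutoff."""
    return {
        gene
        for gene, (log2fc, padj) in results.items()
        if abs(log2fc) >= lfc_cutoff and padj < padj_cutoff
    }


def common_degs(models: Dict[str, ModelResults]) -> Set[str]:
    """Genes that are significant in every model."""
    gene_sets = [significant_genes(results) for results in models.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


# Toy data for three hypothetical models.
toy_models = {
    "model_a": {"GeneA": (2.1, 0.001), "GeneB": (-1.5, 0.01), "GeneC": (0.2, 0.5)},
    "model_b": {"GeneA": (1.4, 0.02), "GeneB": (0.3, 0.6), "GeneC": (1.8, 0.03)},
    "model_c": {"GeneA": (1.2, 0.04), "GeneB": (-2.0, 0.001), "GeneC": (1.1, 0.2)},
}

print(sorted(common_degs(toy_models)))
```

Because a gene has to pass in every model, the intersection shrinks quickly as models are added, which is consistent with the abstract's remark that only a few common DEGs were observed.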
3.Computational pathology in precision oncology: Evolution from task-specific models to foundation models.
Yuhao WANG ; Yunjie GU ; Xueyuan ZHANG ; Baizhi WANG ; Rundong WANG ; Xiaolong LI ; Yudong LIU ; Fengmei QU ; Fei REN ; Rui YAN ; S Kevin ZHOU
Chinese Medical Journal 2025;138(22):2868-2878
With the rapid development of artificial intelligence, computational pathology has been seamlessly integrated into the entire clinical workflow, which encompasses diagnosis, treatment, prognosis, and biomarker discovery. This integration has significantly enhanced clinical accuracy and efficiency while reducing the workload for clinicians. Traditionally, research in this field has depended on the collection and labeling of large datasets for specific tasks, followed by the development of task-specific computational pathology models. However, this approach is labor intensive and does not scale efficiently for open-set identification or rare diseases. Given the diversity of clinical tasks, training individual models from scratch to address the whole spectrum of clinical tasks in the pathology workflow is impractical, which highlights the urgent need to transition from task-specific models to foundation models (FMs). In recent years, pathological FMs have proliferated. These FMs can be classified into three categories, namely, pathology image FMs, pathology image-text FMs, and pathology image-gene FMs, each with distinct functionalities and application scenarios. This review provides an overview of the latest research advancements in pathological FMs, with a particular emphasis on their applications in oncology. The key challenges and opportunities presented by pathological FMs in precision oncology are also explored.
Humans; Precision Medicine/methods*; Medical Oncology/methods*; Artificial Intelligence; Neoplasms/pathology*; Computational Biology/methods*
4.Databases, knowledge bases, and large models for biomanufacturing.
Zhitao MAO ; Xiaoping LIAO ; Hongwu MA
Chinese Journal of Biotechnology 2025;41(3):901-916
Biomanufacturing is an advanced manufacturing method that integrates biology, chemistry, and engineering. It utilizes renewable biomass and biological organisms as production media to scale up the production of target products through fermentation. Compared with petrochemical routes, biomanufacturing offers significant advantages in reducing CO2 emissions, lowering energy consumption, and cutting costs. With the development of systems biology and synthetic biology and the accumulation of bioinformatics data, the integration of information technologies such as artificial intelligence, large models, and high-performance computing with biotechnology is propelling biomanufacturing into a data-driven era. This paper reviews the latest research progress on databases, knowledge bases, and large language models for biomanufacturing. It explores the development directions, challenges, and emerging technical methods in this field, aiming to provide guidance and inspiration for scientific research in related areas.
Biotechnology/methods*; Knowledge Bases; Synthetic Biology; Databases, Factual; Artificial Intelligence; Systems Biology; Computational Biology; Fermentation
5.Artificial intelligence-enhanced physics-based computational modeling technologies for proteins.
Baoyan LIU ; Shuai LI ; Hao SU ; Xiang SHENG
Chinese Journal of Biotechnology 2025;41(3):917-933
Computational modeling is an invaluable tool for mechanism analysis, directed engineering, and rational design of biological parts, metabolic networks, and even cellular systems. It can provide new technological solutions to biological challenges at different levels and has become a central focus of research in biomanufacturing. Proteins are the key parts of biological systems, and traditional physics-based methods (computer software and mathematical models) have been widely used to study the physical and chemical processes underlying protein function; they are recognized as powerful tools for understanding complex biological systems and guiding experimental design. As the scale of computational modeling continues to expand, however, traditional modeling techniques struggle to balance computational accuracy and speed. In recent years, the explosive growth of biological data has made it possible to construct high-performance artificial intelligence (AI) models, bringing new opportunities to the computational modeling of proteins and giving rise to AI-enhanced physics-based computational modeling technologies. This combined strategy incorporates chemical knowledge and established physical principles while also being powerful in data processing and pattern recognition, which greatly improves computational efficiency and prediction accuracy and provides stronger interpretability, transferability, and robustness. AI-enhanced physics-based computational modeling technologies have already shown great potential and value in biocatalysis, paving a new way for the future development of biomanufacturing.
Artificial Intelligence; Proteins/chemistry*; Computer Simulation; Software; Computational Biology/methods*
6.Research progress in mutation effect prediction based on protein language models.
Liang ZHANG ; Pan TAN ; Liang HONG
Chinese Journal of Biotechnology 2025;41(3):934-948
Predicting protein mutation effects is a key challenge in bioinformatics and protein engineering. Recent advancements in deep learning, particularly the development of protein language models (PLMs), have brought new opportunities to this field. This review summarizes the application of PLMs in predicting protein mutation effects, focusing on three main types of models: sequence-based models, structure-based models, and models that combine sequence and structural information. We analyze in detail the principles, advantages, and limitations of these models and discuss the application of unsupervised and supervised learning in model training. Furthermore, this paper discusses the main challenges currently faced, including the acquisition of high-quality datasets and the handling of data noise. Finally, we look ahead to future research directions, including the application prospects of emerging technologies such as multimodal fusion and few-shot learning. This review aims to provide researchers with a comprehensive perspective to further advance the prediction of protein mutation effects.
Mutation; Proteins/chemistry*; Computational Biology/methods*; Deep Learning; Protein Engineering
7.Intelligent mining, engineering, and de novo design of proteins.
Cui LIU ; Zhenkun SHI ; Hongwu MA ; Xiaoping LIAO
Chinese Journal of Biotechnology 2025;41(3):993-1010
Natural biological components were shaped by long-term evolution to serve cell survival, and they often fail to meet the demands of engineered cells that must perform biological functions efficiently in special industrial environments. Enzymes, as biological catalysts, play a key role in biosynthetic pathways, significantly enhancing the rate and selectivity of biochemical reactions. However, the catalytic efficiency, stability, substrate specificity, and tolerance of natural enzymes often fall short of industrial production requirements. Therefore, exploring and modifying enzymes to suit specific biomanufacturing processes has become crucial. In recent years, artificial intelligence (AI) has played an increasingly important role in the discovery, evaluation, engineering, and de novo design of proteins. AI can accelerate the discovery and optimization of proteins by analyzing large amounts of bioinformatics data and by predicting protein functions and characteristics with machine learning and deep learning algorithms. Moreover, AI can assist researchers in designing new protein structures by simulating and predicting their performance under different conditions, providing guidance for protein design. This paper reviews the latest research advances in protein discovery, evaluation, engineering, and de novo design for biomanufacturing and explores the hot topics, challenges, and emerging technical methods in this field, aiming to provide guidance and inspiration for researchers in related fields.
Protein Engineering/methods*; Artificial Intelligence; Proteins/genetics*; Computational Biology; Machine Learning; Data Mining; Algorithms; Deep Learning
8.pLM4ACP: a model for predicting anticancer peptides based on machine learning and protein language models.
Yitong LIU ; Wenxin CHEN ; Juanjuan LI ; Xue CHI ; Xiang MA ; Yanqiong TANG ; Hong LI
Chinese Journal of Biotechnology 2025;41(8):3252-3261
Cancer is a serious global health problem and a major cause of human death. Conventional cancer treatments often carry the risk of impairing vital organ functions. Anticancer peptides (ACPs) are considered among the most promising therapeutic agents against common human cancers because of their small size, high specificity, and low toxicity. Because the identification of ACPs is largely confined to the laboratory, expensive, and time-consuming, we proposed pLM4ACP, a model for predicting ACPs based on machine learning and protein language models. In this model, the protein language model ProtT5 was used to extract the features of ACPs, and the extracted features were fed into a support vector machine (SVM) classifier for optimization and performance evaluation. The model achieved significantly higher accuracy than other methods, with an overall accuracy of 0.763, an F1-score of 0.767, a Matthews correlation coefficient of 0.527, and an area under the curve of 0.827 on the independent test set. This study constructs an efficient anticancer peptide prediction model based on protein language models, further advancing the application of artificial intelligence in the biomedical field and promoting the development of precision medicine and computational biology.
Machine Learning; Antineoplastic Agents/chemistry*; Humans; Peptides/chemistry*; Support Vector Machine; Algorithms; Computational Biology/methods*; Neoplasms/drug therapy*
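The pLM4ACP abstract in entry 8 reports accuracy, F1-score, and Matthews correlation coefficient (MCC). For readers less familiar with these metrics, the Python sketch below computes all three from the four cells of a binary confusion matrix, using the standard definitions. The counts in the usage lines are invented for illustration and are not the study's data.

```python
import math
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, F1-score and MCC from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # MCC uses all four cells; by convention it is set to 0 when any
    # marginal total is zero and the denominator vanishes.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Invented counts for a balanced test set of 100 peptides.
toy_metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC ranges from -1 to 1 and stays near 0 for a classifier that guesses, so it can be noticeably lower than accuracy and F1 on the same predictions, as in the figures the abstract reports.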
9.Preliminary exploration of multi-omics data fusion methods for high-dimensional small-sample datasets in traditional Chinese medicine.
Nian WANG ; Cheng-Cheng YU ; Hu YANG ; Zhong WANG ; Jun LIU
China Journal of Chinese Materia Medica 2025;50(1):278-284
With the advancement in big data and artificial intelligence technologies, the extensive application of omics technologies in traditional Chinese medicine (TCM) research has generated large experimental datasets, enabling the exploration of cross-scale correlations among massive data and thereby resulting in the shift toward a data-intensive research paradigm. The emerging approach of multi-omics data fusion analysis, emphasizing technical and computational tools, presents a potential breakthrough in this field. The holistic perspective of TCM aligns with the concept of multi-omics data fusion, yet the data types encountered exhibit high dimensionality with small sample sizes, necessitating data processing techniques such as dimensionality reduction. The current challenge lies in selecting suitable analytical methods for these data to enhance the systematic understanding of physiological functions and disease diagnosis/treatment processes. This paper explores the theories and frameworks of multi-omics data fusion, analyzes methods for fusing high-dimensional, small-sample multi-omics data in TCM, and aims to provide insights for advancing TCM research.
Medicine, Chinese Traditional/methods*; Humans; Computational Biology/methods*; Genomics/methods*; Sample Size; Artificial Intelligence; Multiomics
10.Prediction of immunotherapy targets for chronic cerebral hypoperfusion by bioinformatics method.
Mei ZHAO ; Yanpeng XUE ; Qingqing TIAN ; He YANG ; Qing JIANG ; Mengfan YU ; Xin CHEN
Journal of Biomedical Engineering 2025;42(2):382-388
Chronic cerebral hypoperfusion (CCH) plays an important role in the occurrence and development of vascular dementia (VD). Recent studies have indicated that multiple stages of immune-inflammatory response are involved in the process of cerebral ischemia, drawing increasing attention to immune therapies for cerebral ischemia. This study aims to identify potential immune therapeutic targets for CCH using bioinformatics methods from an immunological perspective. We identified a total of 823 differentially expressed genes associated with CCH and further screened 9 core immune-related genes, namely RASGRP1, FGF12, SEMA7A, PAK6, EDN3, BPHL, FCGRT, HSPA1B and MLNR. Gene enrichment analysis showed that the core genes were mainly involved in biological functions such as cell growth, neural projection extension, and mesenchymal stem cell migration. Biological signaling pathway analysis indicated that the core genes were mainly involved in the regulation of the T cell receptor, Ras and MAPK signaling pathways. Through LASSO regression, we identified RASGRP1 and BPHL as key immune-related core genes. Additionally, by integrating differential miRNAs and the miRWalk database, we identified miR-216b-5p as a key immune-related miRNA that regulates RASGRP1. In summary, the predicted miR-216b-5p/RASGRP1 signaling pathway plays a significant role in immune regulation during CCH, which may provide new targets for immune therapy in CCH.
Humans; Computational Biology/methods*; Brain Ischemia/therapy*; Immunotherapy; MicroRNAs/genetics*; Signal Transduction; Dementia, Vascular/genetics*; Chronic Disease
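Entry 10 uses LASSO regression to narrow 9 core genes down to 2. The Python sketch below illustrates why an L1 penalty can act as a feature selector: coordinate descent with soft-thresholding sets the coefficient of an uninformative feature to exactly zero. It is a minimal pure-Python illustration under stated assumptions (toy centred data, penalty `alpha = 0.5`, hypothetical function names), not the authors' pipeline, and it omits the cross-validated choice of penalty that a real analysis would need.

```python
from typing import List


def soft_threshold(value: float, penalty: float) -> float:
    """Shrink value toward zero by penalty; return 0.0 if it lies within the penalty band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X: List[List[float]], y: List[float],
                             alpha: float, n_iter: int = 100) -> List[float]:
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent.

    Assumes the columns of X and the response y are already centred (no intercept).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


# Toy data: the response depends only on the first feature (y = 3 * x1);
# the second feature is orthogonal to it and carries no signal.
X_toy = [[-2.0, 1.0], [-1.0, -1.0], [0.0, 0.0], [1.0, -1.0], [2.0, 1.0]]
y_toy = [-6.0, -3.0, 0.0, 3.0, 6.0]

weights = lasso_coordinate_descent(X_toy, y_toy, alpha=0.5)
selected = [j for j, coef in enumerate(weights) if coef != 0.0]
print(weights, selected)
```

The informative coefficient is also shrunk below its true value of 3, which is the usual price of the L1 penalty. In practice the penalty is chosen by cross-validation, which sets how many features survive.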
