1. Development and validation of PhenoRAG: A visualization tool for automated human phenotype ontology term annotation based on large language models and retrieval-augmented generation technology.
Wei ZHONG ; Yousheng YAN ; Kai YANG ; Yan LIU ; Xinyu FU ; Zhengyang YAO ; Chenghong YIN
Chinese Journal of Medical Genetics 2026;43(1):36-43
OBJECTIVE:
To develop a user-friendly visualization application for the automatic annotation of Human Phenotype Ontology (HPO) terms based on large language models and retrieval-augmented generation (RAG) technology, and to validate its performance in an authoritative case dataset.
METHODS:
By integrating the domestic open-source large language model DeepSeek-V3 with RAG technology, an interactive web application was deployed on the Streamlit cloud platform. Using only the latest official HPO dataset as the data source, the lightweight sentence-embedding model BAAI/bge-small-en-v1.5 was employed to construct a FAISS vector index. During the online phase, a four-step closed-loop process was automatically completed: multilingual translation, phenotype phrase extraction, RAG candidate retrieval, term mapping, and official database validation. A total of 121 English case reports publicly released by BMJ Case Reports and Oxford Medical Case Reports (with a gold-standard HPO set of 1,794 terms) were selected for application validation. Precision, recall, and F1 score were calculated and compared with those of traditional dictionary tools, standalone large language models, and the similar application "RAG-HPO". Finally, the model was replaced with the more advanced ChatGPT-5, and its performance was evaluated on a newly extracted dataset.
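The candidate-retrieval step can be illustrated with a minimal sketch. It is a stand-in under stated assumptions, not PhenoRAG's implementation: a bag-of-words count vector replaces the BAAI/bge-small-en-v1.5 embedding, a linear scan replaces the FAISS index, and the three term labels and the function names (`embed`, `cosine`, `retrieve`) are illustrative only.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase token counts (stand-in for a sentence-embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(phrase, term_labels, k=2):
    """Return the k labels most similar to the extracted phenotype phrase."""
    query = embed(phrase)
    ranked = sorted(term_labels, key=lambda label: cosine(query, embed(label)), reverse=True)
    return ranked[:k]

# Illustrative labels only; a real index would hold every term of the HPO release.
TERM_LABELS = ["global developmental delay", "short stature", "hearing impairment"]
candidates = retrieve("delay in global development", TERM_LABELS, k=1)
```

In a pipeline like the one described, the retrieved candidates would then be passed to the language model for term mapping and checked against the official database.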
RESULTS:
An HPO term automatic annotation visualization application named PhenoRAG, based on large language models and RAG technology, was successfully developed. Users can access it directly via a web link. Across the 112 cases, a total of 2,150 HPO terms were generated; 2,064 (96.0%) were fully validated by the official database, with a hallucination rate of 1.3% and an HPO ID-name mismatch rate of 2.7%. After deduplication, 1,906 terms remained for testing. The overall precision was 63.65%, recall was 67.34%, and F1 was 65.44%, significantly outperforming traditional annotation tools (F1: 0.45-0.49, P < 0.001). Although PhenoRAG's F1 was lower than that of RAG-HPO (F1 = 0.78, P < 0.001), which relies on a manually constructed synonym database of 54,000 entries plus the HPO dataset, it requires no additional dictionary maintenance and can be used without any background in computer programming. Moreover, after switching to the GPT-5 model, PhenoRAG produced no hallucinations on the new dataset, and its F1 score increased significantly (P = 0.038).
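The relationship between the reported metrics can be checked with a short sketch. It assumes the standard definitions (F1 as the harmonic mean of precision and recall); the abstract does not state how scores were aggregated across cases, and the identifiers in the set-based example are hypothetical placeholders, not real HPO IDs.

```python
def harmonic_mean(p, r):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

def precision_recall_f1(predicted, gold):
    """Scores for one set of predicted terms against a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall, harmonic_mean(precision, recall)

# Consistency checks on the reported figures (values in percent).
reported_f1 = harmonic_mean(63.65, 67.34)      # about 65.44
validated_share = 100 * 2064 / 2150            # share of generated terms validated

# Hypothetical identifiers, for illustration only.
p, r, f1 = precision_recall_f1({"HP:A", "HP:B", "HP:C"}, {"HP:B", "HP:C", "HP:D", "HP:E"})
```

The reported precision and recall reproduce the reported F1, and 2,064 of 2,150 matches the stated 96.0%.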
CONCLUSION
Without constructing a synonym database, PhenoRAG achieved high-accuracy automatic mapping from clinical text to standard HPO terms. It features a low usage threshold, free access, and a Chinese-language interface, and can directly serve rare disease diagnosis, genetic counseling, and research scenarios in China and worldwide, warranting further clinical promotion and multicenter validation.
Humans ; Phenotype ; Biological Ontologies ; Language ; Software ; Large Language Models
2. Application of artificial intelligence-assisted chromosome karyotyping analysis in prenatal diagnosis of chromosomal mosaicism.
Ling ZHAO ; Shiwei SUN ; Qinghua ZHENG ; Qing YU ; Chongyang ZHU ; Ling LIU ; Yueli WU
Chinese Journal of Medical Genetics 2026;43(3):180-187
OBJECTIVE:
To explore the application value of artificial intelligence (AI)-assisted chromosomal karyotype analysis in the diagnosis of prenatal chromosomal mosaicism.
METHODS:
A retrospective analysis was conducted on 172 pregnant women who underwent amniocentesis at the Department of Medical Genetics and Prenatal Diagnosis, the Third Affiliated Hospital of Zhengzhou University between January 2019 and December 2024. In all cases, the fetus was diagnosed with chromosomal mosaicism via karyotype analysis. The cases were stratified into two groups based on the analytical software employed: the conventional analysis group (n = 70), which utilized Leica analysis software for karyotype image recognition and cell counting; and the AI-assisted analysis group (n = 102), which utilized AI-assisted software for the same procedures. The clinical performance of AI-assisted karyotype analysis in diagnosing chromosomal mosaicism was comprehensively evaluated by comparing the types of mosaic karyotypes, distribution of mosaic ratios, and verification outcomes of different detection modalities between the two groups. This study was approved by the Medical Ethics Committee of the Third Affiliated Hospital of Zhengzhou University (Ethics No.: 2024-406-01).
RESULTS:
No statistically significant difference was observed in baseline characteristics (maternal age, gestational week, and indications for prenatal diagnosis) between the two groups. Regarding the detection efficacy for numerical and structural mosaicisms, no significant difference was found in the detection of numerical mosaicism. However, the conventional analysis group exhibited a significantly higher detection rate of autosomal structural mosaicism compared to the AI-assisted group (11.43% vs. 0.98%, P < 0.05). Numerical mosaicism cases were further verified using copy number variation sequencing (CNV-seq) and/or fluorescence in situ hybridization (FISH). The AI-assisted group demonstrated a significantly lower inconsistency rate (5.56% vs. 20.41%, P < 0.05) compared to the conventional group. For low-proportion (< 10%) chromosomal mosaicism, the AI-assisted group had a significantly lower detection rate (13.25% vs. 29.69%, P < 0.05). Subsequent validation of low-proportion mosaicism by CNV-seq and/or FISH showed a higher consistency rate in the AI-assisted group (81.82% vs. 54.55%), though the difference did not reach statistical significance (P = 0.360).
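A between-group comparison of this kind can be sketched as a two-sided Fisher exact test on a 2x2 table. Two assumptions are made here: the abstract does not name the statistical test used, and the counts 8/70 and 1/102 are inferred from the rounded percentages 11.43% and 0.98% rather than reported directly.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(prob(x) for x in range(low, high + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Inferred counts (assumption): structural mosaicism in 8 of 70 vs. 1 of 102 cases.
p_value = fisher_exact_two_sided(8, 62, 1, 101)
```

Under these inferred counts the p-value falls below 0.05, in line with the reported significance.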
CONCLUSION
For the karyotyping analysis of prenatal chromosomal mosaicism, AI-assisted karyotype analysis shows high accuracy and consistency in identifying numerical chromosomal mosaicism, particularly in reducing the detection of low-proportion (< 10%) mosaicism while improving verification accuracy. AI-assisted analysis can significantly improve the detection accuracy of numerical mosaicism and mitigate the risk of misclassification for low-proportion (< 10%) mosaicism, thereby providing more precise clinical evidence for the prenatal diagnosis of chromosomal mosaicisms.
Humans ; Female ; Mosaicism ; Pregnancy ; Karyotyping/methods* ; Artificial Intelligence ; Prenatal Diagnosis/methods* ; Adult ; Retrospective Studies ; Chromosome Disorders/genetics* ; Amniocentesis
3. ACTA at the crossroads.
Acta Medica Philippina 2026;60(1):5-6
Academic publishing is at a critical juncture. The challenges faced by academics are mired in controversy. Among these are three hotly debated concerns. First is the issue of whether technological innovations such as artificial intelligence (AI) improve research efficiency or whether their use sacrifices research integrity. Another is the controversy between paywall publishing and open access. Lastly, adopting an appropriate business model for sustainability is a contentious issue, and the choice between a commercial and a university-based publishing platform is a difficult one.
Traditional models of scientific investigation relied on tedious intellectual calisthenics in all aspects of research: identifying research gaps, reviewing published literature, devising valid methodology, collecting data, analysing results, and, finally, drawing conclusions. With the advent of powerful tools employing artificial intelligence, these heavy tasks are efficiently carried out. The dilemma lies in determining which parts of the work can be attributed to the authors and which are ascribed to the output of large language models (LLMs) and other automated assistance employed. Even though such AI-aided output requires adequate vetting by experts, many in the scientific community still question these methods. Can research employing AI be considered honest work? Will full disclosure answer doubts as to the integrity of the scientific work?
Indeed, LLMs just gather information that is already out there, albeit more efficiently. After all, science progresses by standing on the shoulders of giants. AI makes such work comprehensive and efficient. Standing on those proverbial shoulders, however, requires access to prior work, hence our next challenge in academic publishing: open access versus paid access. Paywalls limit the benefits of valuable research to institutions and universities with the capacity to pay. Excluded from these are those from low-resourced countries, with nations from the global south being affected disproportionately. Additionally, while numerous authors appreciate the features of open access as it improves their impact and visibility, many feel unduly burdened since the cost of publishing in this format is passed on to them.
This brings us to our third issue: who bears the cost of academic publishing? Indeed, it is a lucrative industry, generating an annual revenue of US$19 billion and an estimated 40 percent profit margin. Many, however, find fault in this business model as concerns about the profit motives of the commercial publishers far overshadow their sustainability goals.
How do we navigate this landscape of controversies? We, at the ACTA, as part of the community of scholars, would need to clarify our mission. Our goals for this publication should be consistent with our values. These values, such as scientific rigor, integrity, and accountability, should be reflected in our policies. We should be cognizant of the role we play in national scientific discourse while we endeavor to make an impact in the global scene. We are accountable to our stakeholders: nurturing early career scholars, supplying evidence to health policymakers, and being accountable to those who provide resources to sustain us. This stewardship is essential so that ACTA will stand shoulder to shoulder with the giants on which science builds to benefit future generations.
Artificial Intelligence ; Commerce ; Costs And Cost Analysis ; Disclosure ; Drawing ; Efficiency ; Family Characteristics ; Forecasting ; Goals ; Gymnastics ; Health ; Health Resources ; Industry ; Intelligence ; Inventions ; Language ; Literature ; Methods ; Play And Playthings ; Policy ; Publications ; Publishing ; Research ; Residence Characteristics ; Role ; Science ; Shoulder ; Social Responsibility ; Universities ; Ursidae ; Volition ; Work ; World Health Organization
4. Revolutionizing pathology in the Philippines.
Philippine Journal of Pathology 2025;10(2):52-62
Artificial Intelligence (AI) is transforming the landscape of pathology, particularly in resource-constrained settings like the Philippines. This narrative review explores the applications, challenges, and future potential of AI in digital image analysis for pathology practices. By synthesizing peer-reviewed literature from 2019 to 2024, the review highlights the role of machine learning (ML) and deep learning (DL) algorithms in enhancing diagnostic accuracy, workflow efficiency, and clinical decision-making. AI-driven tools such as convolutional neural networks (CNNs) and transfer learning models have demonstrated significant success in tumor detection, biomarker evaluation, and predictive analytics, paving the way for personalized medicine. However, barriers such as limited annotated datasets, privacy concerns, and model interpretability hinder widespread adoption. The review emphasizes the need for ethical frameworks, workforce training, and infrastructure development to ensure equitable and effective integration of AI into pathology practices. By addressing these challenges, AI has the potential to improve diagnostic precision, expand access to healthcare, and modernize pathology services in the Philippines.
Human ; Artificial Intelligence ; Pathology ; Philippines ; Deep Learning ; Machine Learning
5. Application of machine learning algorithms in predicting new onset hypertension: a study based on the China Health and Nutrition Survey.
Manhui ZHANG ; Xian XIA ; Qiqi WANG ; Yue PAN ; Guanyi ZHANG ; Zhigang WANG
Environmental Health and Preventive Medicine 2025;30():3-3
BACKGROUND:
Hypertension is a serious chronic disease that can lead to various cardiovascular diseases and affects vital organs such as the heart, brain, and kidneys. Our goal was to predict the risk of new onset hypertension using machine learning algorithms and to identify the characteristics of patients with new onset hypertension.
METHODS:
We analyzed data from the 2011 China Health and Nutrition Survey cohort of individuals who were not hypertensive at baseline and had follow-up results available for prediction by 2015. We tested and evaluated the performance of four traditional machine learning algorithms commonly used in epidemiological studies (Logistic Regression, Support Vector Machine, XGBoost, and LightGBM) and two deep learning algorithms (TabNet and AMFormer). Models were built with two feature sets, one of 16 features and one of 29 features. SHAP values were applied to select key features associated with new onset hypertension.
RESULTS:
A total of 4,982 participants were included in the analysis, of whom 1,017 developed hypertension during the 4-year follow-up. Among the 16-feature models, Logistic Regression had the highest AUC, 0.784 (0.775∼0.806). Among the 29-feature models, AMFormer performed best, with an AUC of 0.802 (0.795∼0.820), and also scored highest on MCC (0.417, 95% CI: 0.400∼0.434) and F1 (0.503, 95% CI: 0.484∼0.505), demonstrating superior overall performance compared to the other models. Additionally, key features selected based on AMFormer, such as age, province, waist circumference, urban or rural location, education level, employment status, weight, WHR, and BMI, played significant roles.
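The AUC and MCC metrics reported above can be illustrated with a minimal sketch using their standard definitions. The labels, scores, and confusion-matrix counts below are toy values chosen for illustration, not data from the study.

```python
import math

def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy values only.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
toy_mcc = mcc(tp=40, fp=10, fn=20, tn=130)
```

MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside F1 when, as here, the outcome is a minority class (1,017 of 4,982 participants).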
CONCLUSION
We used the AMFormer model for the first time to predict new onset hypertension, and it achieved the best results among the six algorithms tested. Key features associated with new onset hypertension can be determined through this algorithm. Applying machine learning algorithms can further improve disease prediction and help identify disease risk factors.
Humans ; China/epidemiology* ; Hypertension/diagnosis* ; Machine Learning ; Male ; Female ; Middle Aged ; Adult ; Nutrition Surveys ; Algorithms ; Aged ; Risk Factors
6. Personalized mandibular reconstruction assisted by three-dimensional retrieval model based on fully connected neural network and a database of mandibles.
Shiyu QIU ; Yang LIAN ; Yifan KANG ; Lei ZHANG ; Yiwang CAI ; Xiaofeng SHAN ; Zhigang CAI
Journal of Peking University(Health Sciences) 2025;57(2):360-368
OBJECTIVE:
To propose a new protocol for personalized mandibular reconstruction assisted by a three-dimensional (3D) retrieval model based on a fully connected neural network (FCNN) and a database of mandibles, and to verify the clinical feasibility of the protocol.
METHODS:
A database of mandibles from 300 normal northern Chinese Han individuals was established. On the basis of cephalometry, mandibular landmarks with good stability were screened and selected, and geometric features of the mandible were extracted. A 3D retrieval algorithm was developed that could retrieve from the database the mandible most similar to a given mandible. An FCNN was built to train the algorithm and improve the accuracy of the 3D retrieval model. Using Geomagic Control 2014 software, the matching accuracy of the 3D retrieval model was evaluated on the basis of the aforementioned mandible database and algorithm. From December 2019 to March 2021, a total of 5 patients underwent personalized mandibular reconstruction assisted by the 3D retrieval model based on the mandible database and FCNN in the Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology. The most similar mandible was retrieved from the mandible database through the 3D retrieval algorithm. It was used to restore the premorbid morphology of the defect area and guide mandibular reconstruction. In all 5 patients, the mandible was reconstructed with an iliac flap. The virtual surgical plan was transferred to surgery using individual surgical guides.
RESULTS:
Through screening, mandibular landmarks with high reproducibility and stability were identified and assembled into a mandibular landmark protocol. After training, the average deviation between the given mandible and the most similar mandible retrieved from the 300-case database by the FCNN-based 3D retrieval model was (1.77±0.44) mm, and the root-mean-square deviation was (2.58±0.86) mm. The mandibular reconstruction surgery was successful in all 5 patients. Their facial symmetry and occlusion were restored. All the patients were satisfied with their postoperative appearance. The mean deviation between the postoperative mandible and the preoperative design was (0.98±0.17) mm. The area with a deviation ≤1 mm accounted for 61.34%±14.13%, ≤2 mm for 83.82%±7.35%, and ≤3 mm for 93.94%±2.87%.
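The deviation statistics reported above (mean deviation, root-mean-square deviation, and the share of the surface within 1, 2, and 3 mm) can be sketched as follows. This is an illustration under assumptions, not the Geomagic Control workflow: it takes a list of unsigned point-to-surface distances as given, approximates "area" by the fraction of sampled points (which presumes roughly uniform sampling), and uses toy values.

```python
import math

def deviation_summary(deviations_mm, thresholds=(1.0, 2.0, 3.0)):
    """Summarize point-to-surface deviations (mm) between two 3D models."""
    n = len(deviations_mm)
    mean_dev = sum(abs(d) for d in deviations_mm) / n
    rms_dev = math.sqrt(sum(d * d for d in deviations_mm) / n)
    # Fraction of sampled points whose deviation is within each threshold.
    within = {t: sum(abs(d) <= t for d in deviations_mm) / n for t in thresholds}
    return mean_dev, rms_dev, within

# Toy deviations in millimetres, for illustration only.
sample = [0.5, 1.0, 1.5, 2.5, 3.5]
mean_dev, rms_dev, within = deviation_summary(sample)
```

The root-mean-square value weights large local mismatches more heavily than the mean does, which is consistent with the reported RMS (2.58 mm) exceeding the reported mean (1.77 mm).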
CONCLUSION
The personalized mandibular reconstruction assisted by 3D retrieval model based on the 300-case mandible database and FCNN is feasible clinically.
Humans ; Neural Networks, Computer ; Mandibular Reconstruction/methods* ; Mandible/diagnostic imaging* ; Imaging, Three-Dimensional/methods* ; Adult ; Databases, Factual ; Female ; Male ; Algorithms ; Middle Aged ; Cephalometry
7. Artificial intelligence in stomatology: Innovations in clinical practice, research, education, and healthcare management.
Xuliang DENG ; Mingming XU ; Chenlin DU
Journal of Peking University(Health Sciences) 2025;57(5):821-826
In recent years, China has continued to face a high prevalence of oral diseases, along with uneven access to high-quality dental care. Against this backdrop, artificial intelligence (AI), as a data-driven, algorithm-supported, and model-centered technology system, has rapidly expanded its role in transforming the landscape of stomatology. This review summarizes recent advances in the application of AI in stomatology across clinical care, biomedical and materials research, education, and hospital management. In clinical settings, AI has improved diagnostic accuracy, streamlined treatment planning, and enhanced surgical precision and efficiency. In research, machine learning has accelerated the identification of disease biomarkers, deepened insights into the oral microbiome, and supported the development of novel biomaterials. In education, AI has enabled the construction of knowledge graphs, facilitated personalized learning, and powered simulation-based training, driving innovation in teaching methodologies. Meanwhile, in hospital operations, intelligent agents based on large language models (LLMs) have been widely deployed for intelligent triage, structured pre-consultations, automated clinical documentation, and quality control, contributing to more standardized and efficient healthcare delivery. Building on these foundations, a multi-agent collaborative framework centered around an AI assistant for stomatology is gradually emerging, integrating task-specific agents for imaging, treatment planning, surgical navigation, follow-up prediction, patient communication, and administrative coordination. Through shared interfaces and unified knowledge systems, these agents support seamless human-AI collaboration across the full continuum of care. Despite these achievements, the broader deployment of AI still faces challenges including data privacy, model robustness, cross-institution generalization, and interpretability. 
Addressing these issues will require the development of federated learning frameworks, multi-center validation, causal reasoning approaches, and strong ethical governance. With these foundations in place, AI is poised to move from a supportive tool to a trusted partner in advancing accessible, efficient, and high-quality stomatology services in China.
Artificial Intelligence ; Humans ; Oral Medicine/trends* ; China ; Delivery of Health Care ; Machine Learning
8. Automatic Bone Fracture Reduction Technique with Section Registration.
Qinhui YUAN ; Mengxing LIU ; Chu GUO ; Yukun AN ; Ping ZHOU
Chinese Journal of Medical Instrumentation 2025;49(1):1-7
As a fundamental aspect of bone fracture treatment, fracture reduction plays a decisive role in restoring the structural integrity and function of bones. At present, fracture reduction techniques mostly rely on semi-automatic interaction methods or healthy-side bone templates for registration, which have many limitations in clinical practice. To enhance treatment efficiency and accuracy, an automatic fracture reduction algorithm is proposed. This algorithm uses the similarity of fracture cross-sections for registration, thereby reducing the workload of physicians and eliminating the need for a healthy-side bone template. First, the closed edge is identified and extracted by analyzing the differences in the fracture surface and the heat map of the roughness distribution. Next, the fracture section is determined by using the identified closed edge as a guide for region growing and similarity matching. In the registration phase, because the iterative closest point (ICP) algorithm is highly sensitive to distance, the geometric features of the point clouds are incorporated into the objective function of the registration algorithm to mitigate the influence of noise, and the fracture sections are registered one by one. Finally, the algorithm is tested and compared on 180 simulated datasets and 16 publicly available datasets. The results show that the proposed algorithm significantly improves registration accuracy, and the registration error for clinical bone fracture cases is kept within 1.7 mm.
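The idea of adding geometric features to a distance-only ICP objective can be sketched at the correspondence step. The abstract does not say which features are used or how they are weighted, so everything here is an assumption: surface normals serve as the feature, the penalty is one minus the dot product of unit normals, and `match_cost`, `closest`, and `weight` are hypothetical names.

```python
def match_cost(p, q, weight=1.0):
    """Squared point distance plus a penalty for disagreeing unit normals.

    p and q are (position, normal) pairs, each a 3-tuple of floats.
    """
    (p_pos, p_nrm), (q_pos, q_nrm) = p, q
    dist2 = sum((a - b) ** 2 for a, b in zip(p_pos, q_pos))
    normal_penalty = 1.0 - sum(a * b for a, b in zip(p_nrm, q_nrm))
    return dist2 + weight * normal_penalty

def closest(p, targets, weight=1.0):
    """Index of the target point with the lowest combined cost."""
    return min(range(len(targets)), key=lambda i: match_cost(p, targets[i], weight))

source = ((0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
targets = [
    ((0.1, 0.0, 0.0), (1.0, 0.0, 0.0)),  # nearest in space, normal 90 degrees off
    ((0.3, 0.0, 0.0), (0.0, 0.0, 1.0)),  # slightly farther, normal agrees
]
plain = closest(source, targets, weight=0.0)          # distance only
with_features = closest(source, targets, weight=1.0)  # distance plus normal term
```

With the feature term switched off, the nearest point wins regardless of orientation; with it on, the match moves to the point whose local geometry agrees, which is how such a term can dampen the effect of noisy nearest neighbours.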
Algorithms ; Fractures, Bone/therapy* ; Humans
9. Research Progress and Prospects of Minimally Invasive Surgical Instrument Segmentation Methods Based on Artificial Intelligence.
Weimin CHENG ; Xiaohua WU ; Jing XIONG
Chinese Journal of Medical Instrumentation 2025;49(1):15-23
With the development of artificial intelligence technology and the growing demand for minimally invasive surgery, intelligent minimally invasive surgery has become a research hotspot. Surgical instrument segmentation is a highly promising technology that can enhance the performance of minimally invasive endoscopic imaging systems, surgical video analysis systems, and other related systems. This article summarizes deep learning-based semantic and instance segmentation methods for minimally invasive surgical instruments, analyzes in depth the supervision methods used to train the algorithms, network structure improvements, and attention mechanisms, and then discusses methods based on the Segment Anything Model. Because deep learning methods are highly data-demanding, current data augmentation methods are also reviewed. Finally, a summary and outlook on instrument segmentation technology are provided.
Artificial Intelligence ; Minimally Invasive Surgical Procedures/instrumentation* ; Algorithms ; Deep Learning ; Humans ; Image Processing, Computer-Assisted
10. Application Status of Machine Learning in Assisted Diagnosis Techniques of Cardiovascular Diseases.
Pinliang LIAO ; Zihong WANG ; Miao TIAN ; Hong CHAI ; Xiaoyu CHEN
Chinese Journal of Medical Instrumentation 2025;49(1):24-34
In recent years, cardiovascular disease has become a common disease. With the development of machine learning and big data technologies, the ability to process electrocardiogram (ECG) signals has been greatly enhanced through new computer technologies, bringing new improvements to auxiliary diagnosis technology for cardiovascular disease (CVD). This article discusses the application of machine learning in ECG processing, especially in the auxiliary diagnosis of diseases. First, the conventional signal preprocessing methods are introduced, and then the EEG signal processing methods based on feature extraction and fuzzy classification are explored. Second, the application of auxiliary diagnosis in CVD is further summarized. Finally, the advantages and disadvantages of the two methods are analyzed, and on this basis, a design for an auxiliary diagnostic system compatible with both methods is proposed, providing a new perspective for similar applied research in the future.
Machine Learning ; Cardiovascular Diseases/diagnosis* ; Humans ; Electrocardiography ; Signal Processing, Computer-Assisted ; Diagnosis, Computer-Assisted ; Fuzzy Logic ; Electroencephalography

