1.Development and validation of PhenoRAG: A visualization tool for automated human phenotype ontology term annotation based on large language models and retrieval-augmented generation technology.
Wei ZHONG ; Yousheng YAN ; Kai YANG ; Yan LIU ; Xinyu FU ; Zhengyang YAO ; Chenghong YIN
Chinese Journal of Medical Genetics 2026;43(1):36-43
OBJECTIVE:
To develop a user-friendly visualization application for the automatic annotation of Human Phenotype Ontology (HPO) terms based on large language models and retrieval-augmented generation (RAG) technology, and to validate its performance in an authoritative case dataset.
METHODS:
By integrating the domestic open-source large language model DeepSeek-V3 with RAG technology, an interactive web application was deployed on the Streamlit cloud platform. Using only the latest official HPO dataset as the data source, the lightweight sentence-embedding model BAAI/bge-small-en-v1.5 was employed to construct a FAISS vector index. During the online phase, a four-step closed-loop process is automatically completed: multilingual translation, phenotype phrase extraction, RAG candidate retrieval, term mapping, and official database validation. A total of 121 English case reports publicly released by BMJ Case Reports and Oxford Medical Case Reports (with a gold-standard HPO set of 1,794 terms) were selected for application validation. Precision, recall, and F1 score were calculated and compared with those of traditional dictionary tools, standalone large language models, and the similar application "RAG-HPO". Finally, the model was replaced with the more advanced ChatGPT-5 and its performance was evaluated on a newly extracted dataset.
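The retrieval and validation steps described above can be sketched in a few lines. This is only an illustration under loud assumptions: character-trigram counts stand in for the BAAI/bge-small-en-v1.5 embeddings, a brute-force cosine search stands in for the FAISS index, and the four-entry `HPO_TERMS` table stands in for the full official HPO release. The function names `embed`, `retrieve_candidates`, and `validate` are hypothetical and are not taken from PhenoRAG.

```python
import math
from collections import Counter

# Tiny illustrative subset of HPO (real IDs and labels).
# Assumption: the actual tool indexes the full official HPO dataset.
HPO_TERMS = {
    "HP:0001250": "Seizure",
    "HP:0000252": "Microcephaly",
    "HP:0001263": "Global developmental delay",
    "HP:0001249": "Intellectual disability",
}


def embed(text, n=3):
    """Stand-in embedding: character trigram counts (NOT a sentence-embedding model)."""
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Offline phase: embed every HPO label once (a vector index would hold these).
INDEX = {hpo_id: embed(label) for hpo_id, label in HPO_TERMS.items()}


def retrieve_candidates(phrase, k=2):
    """Online phase: return the k HPO terms most similar to an extracted phrase."""
    query = embed(phrase)
    scored = sorted(
        ((cosine(query, vec), hpo_id) for hpo_id, vec in INDEX.items()),
        reverse=True,
    )
    return [(hpo_id, HPO_TERMS[hpo_id], round(score, 3)) for score, hpo_id in scored[:k]]


def validate(hpo_id, label):
    """Final step: accept only ID-label pairs that exist in the official table."""
    return HPO_TERMS.get(hpo_id) == label


if __name__ == "__main__":
    print(retrieve_candidates("recurrent seizures"))
    print(validate("HP:0001250", "Seizure"))       # ID and name agree
    print(validate("HP:0001250", "Microcephaly"))  # ID-name mismatch
    print(validate("HP:9999999", "Seizure"))       # ID not in the table
```

In a pipeline of this shape, the retrieved candidates would be handed to the language model for term mapping, and `validate` would reject any ID-label pair that the official table does not contain.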
RESULTS:
A visualization application for automated HPO term annotation, named PhenoRAG and based on large language models and RAG technology, was successfully developed. Users can access it directly via a web link. Across the 112 cases, a total of 2,150 HPO terms were generated; 2,064 (96.0%) were fully validated by the official database, with a hallucination rate of 1.3% and an HPO ID-name mismatch rate of 2.7%. After deduplication, 1,906 terms remained for testing. The overall precision was 63.65%, recall was 67.34%, and F1 was 65.44%, significantly outperforming traditional annotation tools (F1: 0.45-0.49, P < 0.001). Although PhenoRAG's F1 was lower than that of RAG-HPO (F1 = 0.78, P < 0.001), which relies on a manually constructed synonym database of 54,000 entries plus the HPO dataset, it requires no additional dictionary maintenance and can be used without any background in computer programming. Moreover, after switching to the GPT-5 model, PhenoRAG produced no hallucinations on the new dataset, and its F1 score increased significantly (P = 0.038).
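As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall using the standard harmonic-mean definition. The sketch below also shows a set-based scorer for one case; the abstract does not say how scores were aggregated across cases, so `precision_recall_f1` and its toy inputs are illustrative assumptions rather than the authors' evaluation code.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(predicted, gold):
    """Set-based scoring of predicted HPO IDs against a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_from(precision, recall)


if __name__ == "__main__":
    # Values reported in the abstract: precision 63.65%, recall 67.34%.
    print(round(100 * f1_from(0.6365, 0.6734), 2))  # 65.44

    # Toy case (hypothetical annotations, not from the study dataset).
    predicted = {"HP:0001250", "HP:0000252", "HP:0001263"}
    gold = {"HP:0001250", "HP:0001249"}
    print(precision_recall_f1(predicted, gold))
```

The recomputed value agrees with the reported F1 of 65.44%.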
CONCLUSION:
Without constructing a synonym database, PhenoRAG achieved high-accuracy automatic mapping from clinical text to standard HPO terms. It features a low usage threshold, free access, and a Chinese-language interface, and can directly serve rare disease diagnosis, genetic counseling, and research scenarios in China and worldwide, warranting further clinical promotion and multicenter validation.
Humans ; Phenotype ; Biological Ontologies ; Language ; Software ; Large Language Models
2.External ocular manifestations among patients diagnosed with Coronavirus disease 2019 in a referral center in the Philippines.
Alyssa Louise B. Pejana-Paulino ; Aramis B. Torrefranca Jr. ; Nilo Vincent DG. Florcruz ; Ma. Dominga B. Padilla
Acta Medica Philippina 2026;60(1):69-77
BACKGROUND AND OBJECTIVES
The global pandemic caused by Coronavirus Disease 2019 (COVID-19) has affected millions, with growing evidence of the potential role of ocular tissues in viral transmission. At the time of writing, local data regarding the phenomenon was limited. This study investigated external ocular manifestations in patients with COVID-19 at a referral center in the Philippines, examined correlations between demographics, systemic manifestations, and laboratory results with ocular manifestations, and determined their timing relative to systemic symptoms.
METHODS
This single-center, descriptive cross-sectional study was carried out from December 8 to 18, 2020 at the adult COVID-19 wards of the Philippine General Hospital involving 72 participants. Data collection involved relevant clinical history taking and gross eye examination. The prevalence of ocular manifestations was described with 95% confidence intervals. Correlations between ocular manifestations and quantitative variables were analyzed with point-biserial correlation, and associations with qualitative variables were tested using chi-square or Fisher’s exact tests.
RESULTS
Among participants, 31.9% presented with ocular manifestations, with foreign body sensation as the most prevalent ocular symptom (11.1%) and conjunctival hyperemia as the most prevalent ocular finding (19.4%). The median age of patients with ocular manifestations was 41 years, with a higher prevalence in the male population (73.9%, CI=95%, p=0.001). No significant correlation was observed between the presence of external ocular manifestations and the different systemic and ocular co-morbidities, or with COVID-19 clinical classification. Among those who experienced symptoms, the majority (29.2%) of the patients experienced systemic symptoms prior to the onset of ocular symptoms. Ocular complaints may present as the sole manifestation (13.9%). Several laboratory parameters were measured, and only temperature and AST levels showed a low positive correlation with the presence of ocular manifestations.
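To illustrate what a 95% confidence interval around the headline prevalence looks like, the sketch below computes a Wilson score interval. Two assumptions are made here and are not stated in the abstract: the count of 23 participants is back-calculated from 31.9% of 72, and the Wilson method is chosen for illustration because the abstract does not name the interval method the authors used.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    n_participants = 72
    # Assumption: count back-calculated from the reported 31.9% of 72 participants.
    n_ocular = round(0.319 * n_participants)
    low, high = wilson_interval(n_ocular, n_participants)
    print(f"{n_ocular}/{n_participants} = {n_ocular / n_participants:.1%}")
    print(f"95% CI (Wilson): {low:.1%} to {high:.1%}")
```

With only 72 participants the interval is wide, which is consistent with the abstract's call for larger local studies.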
CONCLUSION
Ocular manifestations occur in roughly one third of patients with COVID-19 based on this study population. With some individuals presenting with ocular signs or symptoms as the initial and sole manifestation, healthcare practitioners must exercise caution and remain vigilant in managing patients who present as such. At the time of writing, this is the first local study investigating the different external ocular manifestations in patients with COVID-19. There is a need to pursue more robust studies and conduct more local investigations which will guide both ophthalmologists and other practitioners in strengthening existing guidelines regarding precautionary practices, clinical diagnosis, and management of COVID-19 patients.
Human ; Sars-cov-2 ; Covid-19 ; Philippines ; Adult ; Association ; Classification ; Collection ; Confidence Intervals ; Coronavirus ; Cross-sectional Studies ; Data Collection ; Demography ; Diagnosis ; Disease ; Exercise ; Eye ; Foreign Bodies ; History ; Hospitals ; Hospitals, General ; Hyperemia ; Laboratories ; Male ; Morbidity ; Ophthalmologists ; Pandemics ; Patients ; Population ; Prevalence ; Referral And Consultation ; Role ; Sensation ; Temperature ; Time ; Tissues ; Volition ; World Health Organization ; Writing
4.Analysis of the ontology construction approach to acupoint anatomy.
Wenwen LIU ; Xianghong JING ; Feng YANG
Chinese Acupuncture & Moxibustion 2025;45(5):694-702
Through a review of the relevant literature, the concepts, methods, languages and tools of ontology were explored, and suitable methods and tools for the ontology construction of acupoint anatomy were selected. The current mainstream anatomical ontologies and related ontologies of TCM were investigated to provide a reference for the ontology construction of acupoint anatomy. According to the knowledge attributes of acupoint anatomy, the foundational model of anatomy (FMA) served as the reusable ontology, and, in association with the attribute classification of the traditional Chinese medicine language system (TCMLS), the construction approach to acupoint anatomical ontology was explored. Taking "anatomical entity of acupoints" as the top-level concept, a demonstrative study of anatomical ontology construction was conducted on the acupoints of the lung meridian of hand-taiyin.
Acupuncture Points ; Humans ; Meridians ; Medicine, Chinese Traditional ; Biological Ontologies
5.Expert consensus on early orthodontic treatment of class III malocclusion.
Xin ZHOU ; Si CHEN ; Chenchen ZHOU ; Zuolin JIN ; Hong HE ; Yuxing BAI ; Weiran LI ; Jun WANG ; Min HU ; Yang CAO ; Yuehua LIU ; Bin YAN ; Jiejun SHI ; Jie GUO ; Zhihua LI ; Wensheng MA ; Yi LIU ; Huang LI ; Yanqin LU ; Liling REN ; Rui ZOU ; Linyu XU ; Jiangtian HU ; Xiuping WU ; Shuxia CUI ; Lulu XU ; Xudong WANG ; Songsong ZHU ; Li HU ; Qingming TANG ; Jinlin SONG ; Bing FANG ; Lili CHEN
International Journal of Oral Science 2025;17(1):20-20
The prevalence of Class III malocclusion varies among different countries and regions. The populations from Southeast Asian countries (Chinese and Malaysian) showed the highest prevalence rate of 15.8%. The condition can seriously affect oral function, facial appearance, and mental health. As anterior crossbite tends to worsen with growth, early orthodontic treatment can harness growth potential to normalize maxillofacial development or reduce skeletal malformation severity, thereby reducing the difficulty and shortening the duration of later-stage treatment. This is beneficial for the physical and mental growth of children. Therefore, early orthodontic treatment for Class III malocclusion is particularly important. Determining the optimal timing for early orthodontic treatment requires a comprehensive assessment of clinical manifestations, dental age, and skeletal age, and can lead to better results with less effort. Currently, standardized treatment guidelines for early orthodontic treatment of Class III malocclusion are lacking. This review provides a comprehensive summary of the etiology, clinical manifestations, classification, and early orthodontic techniques for Class III malocclusion, along with systematic discussions on selecting early treatment plans. The purpose of this expert consensus is to standardize clinical practices and improve the treatment outcomes of Class III malocclusion through early orthodontic treatment.
Humans ; Malocclusion, Angle Class III/classification* ; Orthodontics, Corrective/methods* ; Consensus ; Child
6.Gene print-based cell subtypes annotation of human disease across heterogeneous datasets with gPRINT.
Ruojin YAN ; Chunmei FAN ; Shen GU ; Tingzhang WANG ; Zi YIN ; Xiao CHEN
Protein & Cell 2025;16(8):685-704
Identification of disease-specific cell subtypes (DSCSs) has profound implications for understanding disease mechanisms, preoperative diagnosis, and precision therapy. However, achieving unified annotation of DSCSs in heterogeneous single-cell datasets remains a challenge. In this study, we developed the gPRINT algorithm (generalized approach for cell subtype identification with single cell's voicePRINT). Inspired by the principles of speech recognition in noisy environments, gPRINT transforms gene position and gene expression information into voiceprints based on ordered and clustered gene expression phenomena, obtaining unique "gene print" patterns for each cell. Then, we integrated neural networks to mitigate the impact of background noise on cell identity label mapping. We demonstrated the reproducibility of gPRINT across different donors, single-cell sequencing platforms, and disease subtypes, and its utility for automatic cell subtype annotation across datasets. Moreover, gPRINT achieved an annotation accuracy of 98.37% when externally validated on the same tissue, surpassing other algorithms. Furthermore, this approach has been applied to fibrosis-associated diseases in multiple tissues throughout the body, as well as to the annotation of fibroblast subtypes in a single tissue, tendon, where fibrosis is prevalent. We successfully achieved automatic prediction of tendinopathy-specific cell subtypes, key targets, and related drugs. In summary, gPRINT provides an automated and unified approach for identifying DSCSs across datasets, facilitating the elucidation of specific cell subtypes under different disease states and providing a powerful tool for exploring therapeutic targets in diseases.
Humans ; Algorithms ; Single-Cell Analysis ; Databases, Genetic ; Molecular Sequence Annotation
7.Research progress on the classification of sepsis and sepsis-related organ dysfunction.
Chinese Critical Care Medicine 2025;37(4):402-406
Sepsis is a life-threatening organ dysfunction syndrome caused by a dysregulated host response to infection. Because infection sources, pathogens, and patients' underlying conditions differ, there is significant heterogeneity in the clinical manifestations, response to treatment, and prognosis of patients with sepsis. Accurate classification and individualized treatment of sepsis will help to further improve the prognosis of patients with sepsis. In recent years, the integration of artificial intelligence and bioinformatics has brought new opportunities for research on sepsis classification. This review systematically introduces a variety of sepsis classification methods and their clinical application value. Clinical data in the electronic medical record, for example dynamic changes in vital signs such as body temperature, can be used as the basis for sepsis classification. Different subtypes of body temperature trajectories differ in physiological characteristics and prognosis, which helps to predict the prognosis of patients and guide fluid management strategies. Biomarker classification can more comprehensively reflect the pathophysiological state of patients. Immune index classification helps to identify immunocompromised patients so that targeted immunotherapy can be carried out. Transcriptome data and genotyping reveal the heterogeneity of sepsis at the molecular level and provide a new perspective for precision medicine. In addition, a detailed systematic review of sepsis-related organ function damage, such as acute respiratory distress syndrome (ARDS), acute kidney injury (AKI), and acute liver injury, has also been conducted, which helps in developing targeted organ protection and treatment strategies. These typing methods have shown good application prospects in clinical practice. However, there are still limitations in the current research, such as typing stability and biomarker selection, which need to be further explored. Future research should focus on the development of stable and efficient typing tools to achieve precise treatment of sepsis and improve the prognosis of patients.
Humans ; Sepsis/classification* ; Multiple Organ Failure/classification* ; Prognosis ; Artificial Intelligence ; Biomarkers ; Computational Biology ; Respiratory Distress Syndrome
8.Characteristics, microbial composition, and mycotoxin profile of fermented traditional Chinese medicines.
Hui-Ru ZHANG ; Meng-Yue GUO ; Jian-Xin LYU ; Wan-Xuan ZHU ; Chuang WANG ; Xin-Xin KANG ; Jiao-Yang LUO ; Mei-Hua YANG
China Journal of Chinese Materia Medica 2025;50(1):48-57
Fermented traditional Chinese medicine (TCM), such as Sojae Semen Praeparatum, Arisaema Cum Bile, Pinelliae Rhizoma Fermentata, red yeast rice, and Jianqu, has a long history of medicinal use. Fermentation technology was recorded in the earliest TCM work, Shen Nong's Classic of the Materia Medica. Microorganisms are essential components of the fermentation process. However, the contamination of fermented TCM by toxigenic fungi and mycotoxins due to unstandardized fermentation processes seriously affects the quality of TCM and poses a threat to the life and health of consumers. In this paper, the characteristics, microbial composition, and mycotoxin profile of fermented TCM are systematically summarized to provide a theoretical basis for its quality and safety control.
Fermentation ; Mycotoxins/analysis* ; Drugs, Chinese Herbal/analysis* ; Fungi/classification* ; Bacteria/genetics* ; Drug Contamination ; Medicine, Chinese Traditional
9.Identification and functional analysis of β-amyrin synthase gene in Dipsacus asper.
Huan LEI ; Hua HE ; Jiao XU ; Chang-Gui YANG ; Wei-Ke JIANG ; Tao ZHOU ; Lan-Ping GUO
China Journal of Chinese Materia Medica 2025;50(4):1043-1050
Dipsaci Radix is a commonly used Chinese herbal medicine in China, with triterpenoid saponins as the main active components. β-Amyrin synthase, a member of the oxidosqualene cyclase superfamily, plays a crucial role in the biosynthesis of oleanane-type triterpenoid saponins. Asperosaponin Ⅵ is an oleanane-type triterpenoid saponin. To explore the β-amyrin synthase genes involved in the biosynthesis of asperosaponin Ⅵ in Dipsacus asper, this study screened the candidate genes from the transcriptome data of D. asper. Two β-amyrin synthase genes, Da OSC1 and Da OSC2, were identified by phylogenetic analysis and correlation analysis. The coding sequences of Da OSC1 and Da OSC2 were 2 286 bp and 2 295 bp in length, encoding 761 and 764 amino acids, respectively. Multiple sequence alignments showed that Da OSC1 and Da OSC2 had three conserved motifs (DCTAE, QW, and MWCYCR) unique to the oxidosqualene cyclase family. Real-time quantitative PCR results showed that Da OSC1 and Da OSC2 had the highest expression levels in the roots. Compared with normal growth conditions, the low-temperature treatment significantly upregulated the expression of Da OSC1 and Da OSC2. Agrobacterium-mediated transient expression of Da OSC1 and Da OSC2 in Nicotiana benthamiana resulted in the production of β-amyrin, which suggested that Da OSC1 and Da OSC2 were able to catalyze the synthesis of β-amyrin. This study clarified the catalytic functions of two β-amyrin synthases in D. asper and analyzed their expression patterns in different tissues and at low temperature. The findings provide a foundation for further studying the biosynthetic pathway and regulatory mechanism of asperosaponin Ⅵ in D. asper.
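The reported sequence lengths can be cross-checked with simple codon arithmetic. The sketch below assumes that each stated coding-sequence length includes the terminal stop codon, which the abstract does not say explicitly; `protein_length` is a hypothetical helper written for this check.

```python
def protein_length(cds_bp, includes_stop=True):
    """Number of amino acids encoded by a coding sequence of cds_bp base pairs.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and does not contribute an amino acid.
    """
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("a complete coding sequence has a positive length divisible by 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # Lengths reported in the abstract for the two candidate genes.
    for gene, cds_bp in (("Da OSC1", 2286), ("Da OSC2", 2295)):
        print(gene, cds_bp, "bp ->", protein_length(cds_bp), "amino acids")
```

Under that assumption, 2 286 bp and 2 295 bp correspond to 761 and 764 amino acids, matching the figures in the abstract.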
Intramolecular Transferases/chemistry* ; Phylogeny ; Plant Proteins/chemistry* ; Gene Expression Regulation, Plant ; Dipsacaceae/classification* ; Saponins/metabolism* ; Oleanolic Acid/metabolism*
10.Multi-gene molecular identification and pathogenicity analysis of pathogens causing root rot of Atractylodes lancea in Hubei province.
Tie-Lin WANG ; Yang XU ; Xiu-Fu WAN ; Zhao-Geng LYU ; Bin-Bin YAN ; Yong-Xi DU ; Chuan-Zhi KANG ; Lan-Ping GUO
China Journal of Chinese Materia Medica 2025;50(7):1721-1726
To clarify the species, pathogenicity, and distribution of the pathogens causing the root rot of Atractylodes lancea in Hubei province, the tissue separation method was used to isolate the pathogens from root rot samples in the main planting areas of A. lancea in Hubei. Based on the preliminary identification of the Fusarium genus by the internal transcribed spacer (ITS) sequence, three housekeeping genes, EF1/EF2, Btu-F-FO1/Btu-F-RO1, and FF1/FR1, were amplified and sequenced. Subsequently, a phylogenetic tree was constructed based on these TEF gene sequences to classify the pathogens. The pathogenicity of these strains was determined using the root irrigation method. A total of 194 pathogen strains were isolated using the tissue separation method. Molecular identification using the three housekeeping genes identified the pathogens as F. solani, F. oxysporum, F. commune, F. equiseti, F. tricinctum, F. redolens, F. fujikuroi, F. avenaceum, F. acuminatum, and F. incarnatum. Among them, F. solani and F. oxysporum were the dominant strains, widely distributed in multiple regions, with F. solani accounting for approximately 54% of the total isolated strains and F. oxysporum accounting for approximately 34%. Other strains accounted for a relatively small proportion, totaling approximately 12%. The results of pathogenicity determination showed that pathogenicity differed among strains. The analysis of the pathogenicity differentiation of the widely distributed F. solani and F. oxysporum strains revealed that these dominant strains in Hubei were mainly highly pathogenic. This study determined the species, pathogenicity, and distribution of the pathogens causing the root rot of A. lancea in Hubei province. The results provide a scientific basis for further understanding the root rot of A. lancea and its epidemic occurrence, and for preventing and controlling this disease.
Plant Diseases/microbiology* ; Atractylodes/microbiology* ; Phylogeny ; Plant Roots/microbiology* ; Fusarium/classification* ; China ; Virulence ; Fungal Proteins/genetics*

