1.Development and validation of PhenoRAG: A visualization tool for automated human phenotype ontology term annotation based on large language models and retrieval-augmented generation technology.
Wei ZHONG ; Yousheng YAN ; Kai YANG ; Yan LIU ; Xinyu FU ; Zhengyang YAO ; Chenghong YIN
Chinese Journal of Medical Genetics 2026;43(1):36-43
OBJECTIVE:
To develop a user-friendly visualization application for the automatic annotation of Human Phenotype Ontology (HPO) terms based on large language models and retrieval-augmented generation (RAG) technology, and to validate its performance in an authoritative case dataset.
METHODS:
The domestic open-source large language model DeepSeek-V3 was integrated with RAG technology, and an interactive web application was deployed on the Streamlit cloud platform. Using only the latest official HPO dataset as the data source, a FAISS vector index was constructed with the lightweight sentence-embedding model BAAI/bge-small-en-v1.5. During the online phase, a four-step closed-loop process was completed automatically: multilingual translation, phenotype phrase extraction, RAG candidate retrieval, term mapping, and official database validation. A total of 121 English case reports publicly released by BMJ Case Reports and Oxford Medical Case Reports (with a gold-standard HPO set of 1,794 terms) were selected for application validation. Precision, recall, and F1 score were calculated and compared head-to-head with traditional dictionary tools, standalone large language models, and the similar application "RAG-HPO". Finally, the model was replaced with the more advanced ChatGPT-5, and its performance was evaluated on a newly extracted dataset.
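The candidate-retrieval and database-validation steps described above might look roughly like the following minimal sketch. It is not the authors' implementation: a character-trigram vector stands in for the BAAI/bge-small-en-v1.5 embeddings, a brute-force cosine search stands in for the FAISS index, and the four-entry `HPO_TERMS` table and all function names are hypothetical.

```python
import math
from collections import Counter

# Tiny illustrative stand-in for the official HPO term list (assumption:
# a real deployment would load the full HPO release instead).
HPO_TERMS = {
    "HP:0001250": "Seizure",
    "HP:0000252": "Microcephaly",
    "HP:0000347": "Micrognathia",
    "HP:0008551": "Microtia",
}

def embed(text):
    """Character-trigram counts; a placeholder for a sentence-embedding model."""
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Offline phase: embed every term label once (the role a vector index plays).
INDEX = {hpo_id: embed(label) for hpo_id, label in HPO_TERMS.items()}

def retrieve_candidates(phrase, k=2):
    """Online phase: return the k term labels most similar to the phrase."""
    query = embed(phrase)
    scored = sorted(
        ((cosine(query, vec), hpo_id) for hpo_id, vec in INDEX.items()),
        reverse=True,
    )
    return [(hpo_id, HPO_TERMS[hpo_id], round(score, 3)) for score, hpo_id in scored[:k]]

def validate(hpo_id, name):
    """Reject unknown IDs (hallucinations) and ID-name mismatches."""
    return HPO_TERMS.get(hpo_id) == name
```

In this sketch, `validate` is what separates the two error types reported in the abstract: an ID absent from the table counts as a hallucination, while a known ID paired with the wrong label counts as an ID-name mismatch.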
RESULTS:
An HPO term automatic annotation visualization application named PhenoRAG, based on large language models and RAG technology, was successfully developed. Users can access it directly via a web link. Across the 112 cases, a total of 2,150 HPO terms were generated; 2,064 (96.0%) were fully validated against the official database, with a hallucination rate of 1.3% and an HPO ID-name mismatch rate of 2.7%. After deduplication, 1,906 terms remained for testing. The overall precision was 63.65%, recall was 67.34%, and F1 was 65.44%, significantly outperforming traditional annotation tools (F1: 0.45-0.49, P < 0.001). PhenoRAG's F1 was lower than that of RAG-HPO (F1 = 0.78, P < 0.001), which relies on a manually constructed synonym database of 54,000 entries plus the HPO dataset; however, PhenoRAG requires no additional dictionary maintenance and can be used without any background in computer programming. Moreover, after switching to the GPT-5 model, PhenoRAG produced no hallucinations on the new dataset, and its F1 score increased significantly (P = 0.038).
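As a quick consistency check on the figures above, the F1 score can be recomputed as the harmonic mean of the reported precision and recall, and the validation rate from the reported term counts. This is a minimal sketch; the function and variable names are illustrative and not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Values reported in the abstract.
precision, recall = 0.6365, 0.6734
f1 = f1_score(precision, recall)

# Share of generated terms fully validated against the official database.
validated_rate = 2064 / 2150

print(f"F1 = {f1:.2%}")                     # F1 = 65.44%
print(f"validated = {validated_rate:.1%}")  # validated = 96.0%
```

Both recomputed values agree with the reported 65.44% and 96.0%.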
CONCLUSION:
Without constructing a synonym database, PhenoRAG achieved high-accuracy automatic mapping from clinical text to standard HPO terms. It features a low barrier to use, free access, and a Chinese-language interface, and can directly serve rare disease diagnosis, genetic counseling, and research scenarios in China and worldwide, warranting further clinical promotion and multicenter validation.
Humans; Phenotype; Biological Ontologies; Language; Software; Large Language Models
2.Genetic analysis and prenatal diagnosis of structural brain abnormalities associated with TUBB gene c.155A>G variant.
Yifan LIU ; Wei SONG ; Xinlian WANG ; Yan RUAN ; Meng ZHANG ; Yujiao CHEN ; Yan LIU ; Puqing ZHANG ; Li WANG ; Yousheng YAN
Chinese Journal of Medical Genetics 2026;43(2):136-142
OBJECTIVE:
To explore the genotype-phenotype correlation in a Chinese family with structural brain abnormalities due to a variant of the TUBB gene.
METHODS:
A family undergoing prenatal diagnosis at Beijing Obstetrics and Gynecology Hospital in October 2024 was selected as the study subject. Clinical data were collected. The amniotic fluid sample was subjected to chromosomal copy number variation sequencing (CNV-seq). Trio whole-exome sequencing (Trio-WES) was carried out on the amniotic fluid and parental blood samples, and the candidate variant was verified by Sanger sequencing. This study was approved by the Medical Ethics Committee of the hospital (Ethics No.: 2023-KY-076-01).
RESULTS:
Both prenatal ultrasound and fetal MRI showed deviation of the brain midline, unilateral lateral ventriculomegaly, and bilateral gyral asymmetry. Trio-WES revealed that the fetus harbored a maternally derived heterozygous missense variant of the TUBB gene [NM_178014.4: c.155A>G (p.N52S)]. Sanger sequencing confirmed that the woman and a previously terminated fetus both harbored the same variant. The proband and the two fetuses all exhibited similar neuroimaging abnormalities, including midline deviation and asymmetrical gyri. Based on the guidelines of the American College of Medical Genetics and Genomics (ACMG), the variant was classified as likely pathogenic (PM2_Supporting+PS2_Moderate+PS3).
CONCLUSION:
The heterozygous c.155A>G (p.N52S) variant of the TUBB gene probably underlies the structural brain abnormalities in this family. These findings have expanded the phenotypic spectrum associated with the variant and facilitated prenatal diagnosis for this family.
Humans; Female; Pregnancy; Prenatal Diagnosis; Tubulin/genetics*; Adult; Brain/diagnostic imaging*; Male; Pedigree; DNA Copy Number Variations/genetics*; Exome Sequencing; Genetic Association Studies; Magnetic Resonance Imaging
3.Genetic analysis of a de novo EFTUD2 variant causing Mandibulofacial dysostosis with microcephaly in a fetus.
Jianyu REN ; Xiaojiao GUAN ; Shuang LIU ; Yousheng YAN ; Shufa YANG
Chinese Journal of Medical Genetics 2026;43(4):288-294
OBJECTIVE:
To investigate the genetic etiology of a fetus diagnosed with Mandibulofacial dysostosis with microcephaly (MFDM).
METHODS:
A fetus that underwent prenatal diagnosis at Beijing Obstetrics and Gynecology Hospital, Capital Medical University, on May 19, 2025 was selected for analysis. Fetal ultrasound findings and the results of chromosomal karyotyping, copy number variation sequencing (CNV-seq), and whole-exome sequencing (WES) were collected. Sanger sequencing was performed for familial validation of the pathogenic variant. The Human Protein Atlas (HPA), STRING, and Simple ClinVar databases were queried to characterize the biological features of the candidate gene. Three-dimensional structures of the wild-type and variant proteins were modeled and analyzed, and the evolutionary conservation of the affected amino acid was assessed using UGENE. Prenatal phenotypes associated with EFTUD2 variants were summarized through a review of the literature. This study was approved by the Ethics Committee of Beijing Obstetrics and Gynecology Hospital, Capital Medical University (Ethics No.: 2025-KY-029-01).
RESULTS:
At 23+2 weeks of gestation, ultrasound examination revealed bilateral microtia with low-set ears, mild micrognathia with a reduced mandibular-facial angle, a single umbilical artery, a slightly narrow aortic diameter, and trivial mitral regurgitation. Amniotic fluid karyotyping and CNV-seq showed no abnormalities. WES identified a de novo, previously unreported EFTUD2 variant, c.698dupA (p.V235Gfs*27), in the fetus. This frameshift variant is predicted to alter the structural integrity of the EFTUD2 protein. Literature review indicated that micrognathia and microtia or low-set ears are the most common sonographic features in fetuses with EFTUD2 variants, while secondary findings may include abnormal stomach bubble, cleft palate, single umbilical artery, gastrointestinal atresia, polyhydramnios, and reduced aortic diameter.
CONCLUSION:
The EFTUD2: c.698dupA (p.V235Gfs*27) variant is likely the genetic cause underlying MFDM in this fetus.
Humans; Mandibulofacial Dysostosis/diagnostic imaging*; Microcephaly/diagnostic imaging*; Female; Pregnancy; Ribonucleoprotein, U5 Small Nuclear/chemistry*; Peptide Elongation Factors/chemistry*; Fetus; DNA Copy Number Variations/genetics*; Adult; Ultrasonography, Prenatal
4.Refined logistics management in hospitals based on information system
Lili KONG ; Yousheng XIAO ; Yupeng YAN ; Zhijie CHEN
Modern Hospital 2024;24(2):280-282
Logistics informatization is important for promoting the high-quality development of public hospitals. It is a driving force for innovation in hospital logistics management and an important practice of "green hospitals" and "smart hospitals". A logistics information system can effectively integrate the people, machines, materials, and events of a hospital to achieve data-driven scientific management and improve service and management efficiency. By analyzing the current status of logistics management at the Sun Yat-sen University Cancer Center, this article proposes a transformation path and management ideas for refined hospital logistics management based on the information system, with the aim of informing future information construction and the development of hospital logistics management.
5.Practice of refined management throughout the whole process of sporadic repair projects in public hospitals
Yupeng YAN ; Lili KONG ; Zixiao JIANG ; Ming CHEN ; Taiying ZHOU ; Yousheng XIAO
Modern Hospital 2024;24(3):413-415,419
As public hospitals continue to expand and their buildings continue to age, sporadic repair projects and the associated expenditures keep increasing. Sporadic repair projects have become an important basic guarantee for the safe, stable and efficient operation of the hospital. Such projects are of many kinds, and they are trivial and scattered. The conflicts among demand, cost control, management capacity and service quality in sporadic repair projects are becoming increasingly prominent and have become a difficulty and pain point of logistics service management. In the practice of managing sporadic repair projects in the hospital, the traditional project management mode was abandoned, a refined management system covering the whole process was established, management personnel at all levels were effectively integrated with the whole project process, and the management capacity and service quality of sporadic repair projects were comprehensively improved.
6.An evaluation of carrier detection for Spinal muscular atrophy using digital PCR assay
Yousheng YAN ; Chianru TAN ; Meng ZHANG ; Fang WANG ; Yipeng WANG ; Xinwen CHEN ; Chenghong YIN ; Yong GUO
Chinese Journal of Medical Genetics 2024;41(1):20-24
Objective: To assess the effectiveness and feasibility of carrier detection for Spinal muscular atrophy (SMA) using a digital PCR assay. Methods: Peripheral blood samples were collected from 214 pregnant women undergoing routine SMA carrier screening, of which 204 were randomly selected samples and 10 were samples with known copy numbers of SMN1 exons 7 and 8. The samples with known copy numbers were randomly mixed into the experiment to validate the performance of the digital PCR assay. The copy numbers of SMN1 exons 7 and 8 and SMN2 exons 7 and 8 in the peripheral blood samples were determined by digital PCR. The results for SMN1 exons 7 and 8 were compared with those of quantitative PCR (qPCR) to assess the reliability and clinical performance of the digital PCR assay. Results: Among the 204 random samples, digital PCR detected five samples with simultaneous heterozygous deletion of SMN1 exons 7 and 8, three samples with heterozygous deletion of SMN1 exon 8 only, and 196 samples with no deletion of SMN1 exons 7 and 8. The ten samples with known SMN1 exon 7 and 8 copy numbers yielded the expected values. The digital PCR results were fully consistent with those of qPCR. Conclusion: The digital PCR results for copy number variation of SMN1 exons 7 and 8 were consistent with qPCR. The digital PCR assay clearly distinguished the copy numbers of the target genes and can therefore be used for SMA carrier screening. Moreover, it can also detect the copy numbers of SMN2 exons 7 and 8, which provides additional information for genetic counseling.
7.Expression of METTL14 in epithelial ovarian cancer and the effect on cell proliferation, invasion and migration of A2780 and SKOV3 cells
Yousheng WEI ; Desheng YAO ; Li LI ; Yan LU ; Xinmei YANG ; Wenge ZHANG
Chinese Journal of Obstetrics and Gynecology 2022;57(1):46-56
Objective: To study the expression of methyltransferase-like protein 14 (METTL14) in epithelial ovarian cancer and its clinical significance, and to explore the effect of METTL14 expression on the proliferation, invasion and migration of ovarian cancer cells. Methods: Immunohistochemistry (IHC) was used to detect METTL14 expression in tumor tissue samples, and the relationships among METTL14 expression, clinicopathological factors, and prognosis in ovarian cancer were analyzed. Lentiviral vectors and small interfering RNA (siRNA) were used to up-regulate and down-regulate METTL14 expression, respectively, in the ovarian cancer cell lines A2780 and SKOV3. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used to measure the N6-methyladenosine (m6A) content of ovarian cancer cells. Cell counting kit-8 (CCK-8), wound healing, and transwell assays were used to examine the function of METTL14 in the cells. Results: (1) The IHC score of METTL14 protein was 6.2±3.7 in 20 samples of ovarian cancer tissue and 3.3±2.5 in 15 samples of normal ovarian tissue, and the difference was statistically significant (t=-2.64, P=0.012). Among the patients with ovarian cancer, 69 cases (57.0%, 69/121) showed high expression of METTL14 protein (IHC score≥6) and 52 cases (43.0%, 52/121) showed low expression (IHC score<6). Compared with patients with low METTL14 expression, those with high expression had later stages, higher rates of lymph node metastasis and abdominal metastasis, and more ascites; the differences were statistically significant (all P<0.05). The overall survival rate was significantly lower in patients with high METTL14 expression than in those with low expression (P=0.009).
(2) LC-MS/MS data showed that the relative m6A content of A2780 and SKOV3 cells in the lentivirus (LV)-METTL14 group was 0.213±0.024 and 0.181±0.018, respectively, significantly higher than in the LV-normal control (NC) group (0.109±0.022 and 0.128±0.020; all P<0.05), whereas the relative m6A content in the si-METTL14 group was 0.063±0.012 and 0.069±0.015, significantly lower than in the si-NC group (0.108±0.014 and 0.121±0.014; all P<0.05). The CCK-8 assay showed that absorbance values were significantly lower in the si-METTL14 group than in the si-NC group at 36, 48, and 60 hours (all P<0.05), and significantly higher in the LV-METTL14 group than in the LV-NC group at 48 and 60 hours (all P<0.01). Scratch wound assays showed that, by 24 hours, the migration rate was lower in the si-METTL14 group than in the si-NC group and higher in the LV-METTL14 group than in the LV-NC group (all P<0.01). In transwell migration and invasion assays, after 24 hours of culture, the numbers of invading and migrating cells were lower in the si-METTL14 group than in the si-NC group and higher in the LV-METTL14 group than in the LV-NC group (all P<0.01). Conclusion: Patients with ovarian cancer and high METTL14 expression have a worse prognosis; METTL14 may increase m6A modification in ovarian cancer cells and promote cell proliferation, invasion and migration.
8.Progress of research on clinical use of non-invasive prenatal screening for special groups of pregnant women.
Yousheng YAN ; Yipeng WANG ; Yan LIU ; Chenghong YIN
Chinese Journal of Medical Genetics 2021;38(7):694-698
As a prenatal test for chromosomal abnormalities, non-invasive prenatal testing (NIPT) has been integrated into prenatal healthcare services. NIPT has shown high sensitivity and specificity in screening for fetal trisomies 13, 18 and 21, and has attained excellent clinical results. With the spread of NIPT screening, international organizations have issued guidelines and comments on its clinical utility and update them regularly. China also issued guidelines for NIPT in 2016. NIPT guidelines in various countries have provided valuable guidance on its target diseases and suitable patient groups, but there have been few research data on its clinical application in special groups of patients. Based on the guidelines and comments of various professional bodies and published data on the clinical utility of NIPT, together with consideration of the conditions in China, the clinical utility of NIPT for particular groups of pregnant women, including those with advanced maternal age, obesity, twin pregnancy and fetal ultrasonographic anomalies, is reviewed. The value of genetic counseling for NIPT, which is critical for its clinical application, is also emphasized.
China; Chromosome Aberrations; Female; Humans; Pregnancy; Pregnant Women; Prenatal Diagnosis; Trisomy 13 Syndrome
9.Identification of a novel c.1A>G variant of GDAP1 gene in a pedigree affected with autosomal recessive fibula atrophy.
Chunlian LIU ; Yousheng YAN ; Junli ZHAO ; Lingxia HA ; Xian XU
Chinese Journal of Medical Genetics 2020;37(11):1244-1246
OBJECTIVE:
To explore the genetic basis for a pedigree affected with Charcot-Marie-Tooth (CMT) disease through high-throughput sequencing.
METHODS:
Potential variants of the genes associated with CMT were screened by next-generation sequencing (NGS) of the members of the pedigree.
RESULTS:
NGS revealed that the two affected sisters both harbored a homozygous c.1A>G variant of the GDAP1 gene, which caused replacement of the first amino acid, methionine, by valine (p.Met1Val). Their parents were both carriers of the heterozygous c.1A>G variant. The variant was previously unreported and has an extremely low frequency in the population. In addition, one of the sisters and the mother also carried a heterozygous c.710A>T variant of the BAG3 gene.
CONCLUSION:
The homozygous c.1A>G variant of the GDAP1 gene probably underlies the CMT in both children. This result has enabled clinical diagnosis and genetic counseling for this pedigree.
Adaptor Proteins, Signal Transducing/genetics*; Apoptosis Regulatory Proteins/genetics*; Charcot-Marie-Tooth Disease/genetics*; Child; Female; Fibula/abnormalities*; Homozygote; Humans; Mutation; Nerve Tissue Proteins/genetics*; Pedigree
10.A consensus recommendation for the interpretation and reporting of copy number variation and regions of homozygosity in prenatal genetic diagnosis.
Weiqiang LIU ; Jian LU ; Jun ZHANG ; Ru LI ; Shaobin LIN ; Yan ZHANG ; Yousheng WANG ; Aihua YIN
Chinese Journal of Medical Genetics 2020;37(7):701-708
Chromosomal microdeletions and microduplications have been shown to account for a significant proportion of the genetic factors underlying birth defects. Chromosomal microarray analysis (CMA) and next generation sequencing-based copy number variation (CNV-seq) assays have been recommended as first-tier tests for prenatal evaluation of disease-causing CNV across the genome. With the broad application of such technologies in prenatal genetic diagnosis, there is a need to enhance the consistency of interpretation and reporting of CNV results in clinical laboratories across China. In addition, a standard guideline for prenatal analysis and reporting of regions of homozygosity (ROH) is also required. To assist the classification, interpretation and reporting of CNV/ROH, the following recommendations have been developed, which may promote the standardized application of CMA/CNV-seq techniques in prenatal genetic diagnosis.
