1. Automated syndrome element differentiation in traditional Chinese medicine based on large language models and text embedding computation
Zhaoyang SUN ; Yang WANG ; Mingze MA ; Yanwen CHEN ; Zhenxiu LYU ; Tiantian JIANG ; Huiling WEN ; Bo CHEN ; Jing GUAN
Journal of Beijing University of Traditional Chinese Medicine 2025;48(8):1176-1184
Objective: This study aimed to develop an automated method for syndrome element differentiation in traditional Chinese medicine (TCM). Methods: We first constructed and trained an instruction-tuned multi-task TCM text embedding model (Instr-MT-TCM) using four distinct TCM task datasets: domain knowledge, synonymous terminology, syndrome differentiation and treatment, and TCM case labels. Subsequently, five TCM diagnostics experts holding master's degrees or higher screened a real-world TCM case dataset and annotated symptoms and signs. The proposed method, which combines Instr-MT-TCM with a large language model (LLM), was then evaluated by F1-score on the syndrome element differentiation task and compared against the manual annotation result. Finally, to validate its feasibility in real-world clinical settings, the method was applied to 48 prostate cancer cases to calculate syndrome element scores. Results: The Instr-MT-TCM model showed rapid performance improvement in its early training phase, achieving a Recall@1 (R@1) of 0.848. Experts curated a dataset of 1,793 real-world clinical cases covering 34 common diseases and 66 syndrome patterns. In the syndrome element differentiation task, the collaborative framework of LLM and Instr-MT-TCM achieved a mean F1-score of 0.927, outperforming the 0.512 from manual annotation. The syndrome element analysis of the prostate cancer cases revealed that the predominant elements of disease nature were fire (heat) and yin deficiency, while the main elements of disease location were bladder and kidney. Conclusion: This study proposes and validates a novel method for automated TCM syndrome element differentiation based on the synergy between an LLM and our custom Instr-MT-TCM model. Achieving a high F1-score (0.927) on real-world data, the method demonstrates excellent accuracy and generalization ability. Its application in prostate cancer analysis highlights its clinical potential, offering effective technical support and a new research direction for intelligent TCM syndrome element differentiation.
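
The abstract reports a mean F1-score but does not state how it was averaged. As a minimal sketch, assuming each case carries a set of syndrome element labels and that F1 is computed per case and then averaged across cases, the metric could be computed as follows. The function names `case_f1` and `mean_f1` and the example labels are hypothetical illustrations, not the authors' code or data.

```python
from typing import Iterable, Set, Tuple


def case_f1(predicted: Set[str], gold: Set[str]) -> float:
    """Set-based F1 between predicted and gold labels for one case."""
    if not predicted and not gold:
        # Assumption: two empty label sets count as perfect agreement.
        return 1.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def mean_f1(pairs: Iterable[Tuple[Set[str], Set[str]]]) -> float:
    """Average the per-case F1 over (predicted, gold) pairs."""
    scores = [case_f1(predicted, gold) for predicted, gold in pairs]
    return sum(scores) / len(scores) if scores else 0.0


# Illustrative labels only; not taken from the study's dataset.
example_cases = [
    ({"fire (heat)", "yin deficiency"}, {"fire (heat)", "yin deficiency"}),
    ({"kidney"}, {"kidney", "bladder"}),
]
average_score = mean_f1(example_cases)
```

Other averaging schemes (for example micro-averaging over all labels) would give different numbers, so the reported 0.927 should be read against whichever definition the full paper specifies.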
