TCMLLM-PR: evaluation of large language models for prescription recommendation in traditional Chinese medicine

Haoyu TIAN; Kuo YANG; Xin DONG; Chenxi ZHAO; Mingwei YE; Hongyan WANG; Yiming LIU; Minjie HU; Qiang ZHU; Jian YU; Lei ZHANG; Xuezhong ZHOU

VernacularTitle:TCMLLM-PR:中医处方推荐大模型评价
Author: Haoyu TIAN ¹ ; Kuo YANG ; Xin DONG ; Chenxi ZHAO ; Mingwei YE ; Hongyan WANG ; Yiming LIU ; Minjie HU ; Qiang ZHU ; Jian YU ; Lei ZHANG ; Xuezhong ZHOU
Author Information

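1. Beijing Key Laboratory of Traffic Data Analysis and Mining, School of Computer Science and Technology, Beijing Jiaotong University, Beijing 100044, China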
Publication Type: Journal Article
Keywords: Large language models; Instruction-tuning; Prescription recommendation; Traditional Chinese medicine (TCM); Assisted decision-making
From: Digital Chinese Medicine 2024;7(4):343-355
Country: China
Language: Chinese
Abstract: Objective: To develop and evaluate TCMLLM-PR, a fine-tuned large language model (LLM) for traditional Chinese medicine (TCM) prescription recommendation. Methods: First, we constructed an instruction-tuning dataset containing 68,654 samples (approximately 10 million tokens) by integrating data from eight sources, including four TCM textbooks, the Pharmacopoeia of the People's Republic of China 2020 (CHP), Chinese Medicine Clinical Cases (CMCC), and hospital clinical records covering lung disease, liver disease, stroke, diabetes, and spleen-stomach disease. We then trained TCMLLM-PR by fine-tuning ChatGLM-6B with P-Tuning v2. The evaluation consisted of three parts: (i) comparison with traditional prescription recommendation models (PTM, TCMPR, and PresRecST); (ii) comparison with TCM-specific LLMs (ShenNong, Huatuo, and HuatuoGPT) and the general-domain ChatGPT; and (iii) assessment of the model's ability to transfer across different disease datasets. Precision, recall, and F1 score were used as evaluation metrics. Results: TCMLLM-PR significantly outperformed the baseline models on the TCM textbook and CHP datasets, with F1@10 improvements of 31.80% and 59.48%, respectively. In cross-dataset validation, the model performed best when transferring from the TCM textbooks to the liver disease dataset, achieving an F1@10 of 0.1551. Analysis of real-world cases showed that the prescriptions recommended by TCMLLM-PR most closely matched those written by doctors. Conclusion: This study integrated LLMs into TCM prescription recommendation by building a tailored instruction-tuning dataset and developing TCMLLM-PR. The best model parameters of TCMLLM-PR will be publicly released to support decision-making in TCM practice (https://github.com/2020MEAI/TCMLLM).
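Note on the metrics: the abstract reports precision, recall, and F1 at a cutoff of 10 herbs (F1@10) but does not give the formulas. The following is a minimal sketch of one common top-k definition, not the authors' evaluation code. It assumes precision is divided by k, recall by the size of the reference prescription, and per-case scores are averaged. The function names and the herb lists are hypothetical, made up for illustration.

```python
from statistics import mean


def precision_recall_f1_at_k(recommended, reference, k=10):
    """Return (precision@k, recall@k, F1@k) for a single clinical case.

    recommended: herbs ranked by the model, best first (hypothetical input).
    reference:   herbs in the doctor's actual prescription.
    Assumption: precision is hits / k, recall is hits / len(reference).
    """
    top_k = set(recommended[:k])
    ref = set(reference)
    hits = len(top_k & ref)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


def mean_f1_at_k(cases, k=10):
    """Average F1@k over (recommended, reference) pairs (assumed macro average)."""
    return mean(precision_recall_f1_at_k(rec, ref, k)[2] for rec, ref in cases)


# Illustrative data only; these herb lists are not taken from the paper.
recommended = ["gancao", "fuling", "baizhu", "chenpi", "banxia"]
reference = ["gancao", "baizhu", "renshen", "fuling"]
p, r, f1 = precision_recall_f1_at_k(recommended, reference, k=5)
print(f"P@5={p:.2f} R@5={r:.2f} F1@5={f1:.4f}")
```

With the illustrative lists above, three of the five recommended herbs appear in the four-herb reference prescription, so precision@5 is 3/5 and recall@5 is 3/4. Other conventions exist, such as dividing precision by the number of herbs actually returned, and they give different values when the model outputs fewer than k herbs.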