1. Research on Lightweight Large Language Models for Ancient Traditional Chinese Medicine Texts Based on Lora Fine-Tuning
Jingxian CHAI ; Xufeng LANG ; Hongyan LI ; Zuojian ZHOU ; Yun LING ; Libin ZHAN ; Kongfa HU ; Xuebin QIAO
World Science and Technology-Modernization of Traditional Chinese Medicine 2025;27(3):823-831
Objective: To address the challenges of constructing large language models for traditional Chinese medicine (TCM) classics, which are complex and expensive to fine-tune, this study explores a lightweight fine-tuning method for such models, aiming to develop a question-answering model centered on TCM classics, particularly the various editions of Shang Han Lun through the ages. Methods: Dataset construction involved designing prompts to guide GPT-4 in generating Q&A pairs based on Shang Han Lun and integrating them with the ShenNong_TCM_Dataset and cMedQA2 datasets. Five general-purpose large models were selected for LoRA fine-tuning. The best model was chosen through evaluation, and the performance of multiple quantized versions was validated. Results: After fine-tuning, the BLEU, ROUGE-1, ROUGE-2, and ROUGE-L metrics for the Qwen-7B-Chat model improved by 17.61, 19.63, 14.3, and 21.4, respectively, compared to the base model. Conclusion: The selected model in this study is capable of effectively understanding and utilizing professional terms and concepts from TCM classics, such as Shang Han Lun, to provide accurate answers to user queries. Compared to similar models, it requires lower fine-tuning costs and computational power, contributing to the dissemination of TCM knowledge and the development of intelligent systems.
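As a rough illustration of why LoRA fine-tuning is considered lightweight, the sketch below compares the number of trainable parameters in a full weight update with a low-rank update of the same matrix, and shows the adapted forward pass y = Wx + scale * B(Ax). The matrix size (4096 x 4096), the rank (8), and all function names are illustrative assumptions, not values or code reported in the abstract.

```python
def full_update_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_update_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update delta_W = B @ A,
    where B is d_out x rank and A is rank x d_in."""
    return rank * (d_out + d_in)


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, scale=1.0):
    """Frozen base projection plus the scaled low-rank correction."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


# Hypothetical layer size and rank, chosen only for illustration.
full = full_update_params(4096, 4096)     # 16,777,216 parameters
lora = lora_update_params(4096, 4096, 8)  # 65,536 parameters
print(f"LoRA trains {lora / full:.2%} of the full update")
```

With B initialised to zeros, as is conventional for LoRA, the adapted layer starts out identical to the frozen base layer, so training begins from the base model's behaviour.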
3. Entity Recognition in Treatise on Cold Damage Based on Relative Position Representation Self-Attention Mechanism
Hongmin XU ; Hongyan LI ; Xufeng LANG ; Zuojian ZHOU ; Yun LING ; Ziyan WANG
Journal of Nanjing University of Traditional Chinese Medicine 2024;40(12):1357-1365
OBJECTIVE: Treatise on Cold Damage is one of the "Four Classics of Traditional Chinese Medicine," containing a wealth of medical practice experience and medication rules. However, there has been insufficient data mining in the ancient literature of Treatise on Cold Damage, particularly due to the complex contextual semantics, making it challenging to fully grasp the interrelationships. This study aims to conduct entity recognition in Treatise on Cold Damage to facilitate comprehensive knowledge extraction. METHODS: A Bert-BiLSTM-RPRSA-CRF model was constructed based on the specialized terminology and concise sentence structure of the ancient literature. By incorporating a relative position representation self-attention (RPRSA) layer, this named entity recognition model aimed to identify entities within the text while learning information at different levels, thereby enhancing accuracy. RESULTS: Experimental verification demonstrated that our named entity recognition model achieved F1-Score, precision, and recall rates of 88.24%, 88.48%, and 88.00%, respectively, on the Treatise on Cold Damage dataset, outperforming other commonly used models. CONCLUSION: Our method outperforms other models in identifying entities within Treatise on Cold Damage, providing a foundation for information extraction from traditional Chinese medicine ancient texts such as Treatise on Cold Damage while offering effective means for intelligent assisted diagnosis and treatment in traditional Chinese medicine.
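The three metrics reported in the abstract above are linked by the standard definition of F1 as the harmonic mean of precision and recall. The short check below applies that definition to the reported precision (88.48%) and recall (88.00%); the function name is an illustrative choice, and the abstract does not state how the authors aggregated scores across entity types.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
f1 = f1_score(88.48, 88.00)
print(round(f1, 2))  # 88.24, matching the reported F1-Score
```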
4. Definition and Extraction of Traditional Chinese Medicine Inspection Gait Features Based on Three-Dimensional Key Points of the Human Body
Aihua GUAN ; Jilong SHEN ; Ziyan WANG ; Qi ZHANG ; Tao YANG ; Xufeng LANG ; Jiadong XIE ; Kongfa HU
Journal of Nanjing University of Traditional Chinese Medicine 2024;40(12):1331-1339
OBJECTIVE: To analyze the differences in gait features between patients with cardiovascular and cerebrovascular diseases and healthy people, and to explore new objective features of traditional Chinese medicine (TCM) whole-body inspection. METHODS: A monocular camera was used to collect frontal walking videos of subjects, and the diagnosis results of TCM practitioners were used as disease annotation data; a deep learning model was used to estimate the three-dimensional coordinates of key points; the gait features were defined and calculated based on the three-dimensional coordinates of key points of the lower limbs; differences in gait features among people with cardiovascular and cerebrovascular diseases were collected and verified. RESULTS: The three-dimensional coordinates of key points of the lower limbs were automatically extracted and 8 types of TCM gait features were calculated: step width, stride length, foot lift height, limb angle, left and right hip joint angles, and left and right knee joint angles. There were significant differences in the features between people with cardiovascular and cerebrovascular diseases and healthy people (P<0.05). CONCLUSION: The TCM inspection gait features extracted in this study can effectively distinguish patients with cardiovascular and cerebrovascular diseases from healthy people, expand the research scope of TCM whole-body inspection, and provide new ideas for the early detection and prevention of cardiovascular and cerebrovascular diseases.
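One plausible way to compute a joint angle such as the knee angle from three-dimensional key points is the angle between the two limb segments meeting at the joint. The sketch below is a minimal version under that assumption; the key-point names, coordinates, and function name are hypothetical, and the abstract does not specify the authors' exact feature definitions.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b formed by the segments b->a and b->c.

    Each point is an (x, y, z) tuple, e.g. hip, knee, ankle for a knee angle.
    """
    ba = [ai - bi for ai, bi in zip(a, b)]
    bc = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(u * v for u, v in zip(ba, bc))
    norm = math.sqrt(sum(u * u for u in ba)) * math.sqrt(sum(v * v for v in bc))
    if norm == 0:
        raise ValueError("joint_angle needs three distinct points")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


# Hypothetical key points (metres) for a fully extended leg.
hip, knee, ankle = (0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.0)
print(joint_angle(hip, knee, ankle))
```

Applying the same function to the (shoulder, hip, knee) triple would give a hip joint angle under the same assumption.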
5. Construction of Traditional Chinese Medicine Question-Answering Large Language Model Based on Retrieval-Augmented Generation Technology
Yuming ZHANG ; Hongyan LI ; Xufeng LANG ; Zuojian ZHOU ; Yun LING ; Ziyan WANG
Journal of Nanjing University of Traditional Chinese Medicine 2024;40(12):1375-1382
OBJECTIVE: To construct a large language model for TCM question-answering. METHODS: TCM corpora were built by collecting TCM classics such as Treatise on Cold Damage, TCM textbooks, prescriptions from famous TCM doctors, and other manually annotated TCM datasets. A TCM knowledge vector library was constructed. Retrieval-augmented generation (RAG) was combined with the P-Tuning v2 fine-tuning method and the large language model ChatGLM2-6B to build the TCM question-answering large language model. RESULTS: Precision, Recall, and F1 score were used as evaluation metrics for knowledge question-answering tasks. The model achieved over 90% accuracy in simple TCM question-answering, with the highest accuracy in component-type questions, reaching an F1 score of 0.928. The accuracy of medium to high difficulty questions ranged from 75.8% to 87.7%, with F1 scores all exceeding 0.766. Expert ratings based on diversity and accuracy were used as evaluation metrics for TCM question generation tasks, and the model in this paper scored 9.5 points higher than the baseline model. CONCLUSION: The model in this paper demonstrates good semantic understanding and high reliability, effectively alleviating model hallucinations and helping patients clarify their question intentions. It is of great significance for advancing research on TCM knowledge and providing personalized interactive answers. It also provides an innovative approach to promoting the inheritance and popularization of TCM experience and the intelligent construction of TCM diagnosis and treatment.
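The retrieval step of a retrieval-augmented pipeline can be sketched as: embed the question, rank stored passages by similarity, and prepend the best matches to the prompt. The toy version below uses bag-of-words counts and cosine similarity in place of a learned embedding model and a vector library; the passages, the prompt template, and every function name are placeholders assumed for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lower-cased bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=1):
    """Prepend retrieved context to the question before calling the model."""
    context = "\n".join(retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Placeholder passages standing in for entries in a knowledge library.
passages = [
    "herbal formula composition and dosage",
    "pulse diagnosis procedure",
    "tongue inspection colours",
]
print(build_prompt("what is the composition of the herbal formula", passages))
```

Grounding the answer in retrieved text is the usual mechanism by which such pipelines aim to reduce hallucination.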
9. Simplified Study of Constitution in Chinese Medicine Questionnaire Based on Genetic Algorithm and KNN Method
Shutao GUAN ; Hongyan LI ; Xufeng LANG ; Can LI ; Zuojian ZHOU ; Kongfa HU ; Libin ZHAN
World Science and Technology-Modernization of Traditional Chinese Medicine 2023;25(10):3364-3369
Objective: The Constitution in Chinese Medicine Questionnaire (CCMQ) has many items and takes a long time to fill in when evaluating individual constitution. This research uses artificial intelligence technology to select attributes and help construct a short version of the CCMQ. Methods: Constitution data provided by the Physical Examination Department of Jiangsu Province Hospital of Traditional Chinese Medicine were analyzed, with constitution type as the classification target variable. Genetic-algorithm feature selection, cross-validation, and the KNN classification algorithm were used as filters to select items, and the effect was evaluated by item subset size, KNN classification accuracy, and filling time. Results: The method selected a short version of the CCMQ with 31 items; the average classification accuracy in the model was 86.16%, and the filling time was improved by 47.7%. Conclusion: The algorithm can effectively find a better item subset, achieving dimensionality reduction while retaining a certain level of accuracy, thus helping to simplify the CCMQ.
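The combination described in the abstract above (a genetic algorithm proposing item subsets, scored by cross-validated KNN accuracy) can be sketched roughly as follows. Everything here is an assumption made for illustration: the synthetic data, leave-one-out as the cross-validation scheme, the per-item penalty, and the population, crossover, and mutation settings. None of these are the authors' reported configuration.

```python
import random


def knn_predict(train, query, mask, k=3):
    """Majority label of the k nearest samples, using only masked-in features."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b, m in zip(u, v, mask) if m)
    neighbours = sorted(train, key=lambda s: dist(s[0], query))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)


def loo_accuracy(data, mask, k=3):
    """Leave-one-out KNN accuracy for the feature subset given by mask."""
    if not any(mask):
        return 0.0
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, x, mask, k) == y
    return hits / len(data)


def fitness(data, mask, penalty=0.01):
    """Accuracy minus a small cost per selected item, favouring short subsets."""
    return loo_accuracy(data, mask) - penalty * sum(mask)


def select_features(data, n_features, pop_size=12, generations=15, seed=0):
    """Evolve bit masks with elitism, one-point crossover, and bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(data, m), reverse=True)
        parents = pop[:pop_size // 2]          # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]        # one-point crossover
            if rng.random() < 0.2:             # occasional mutation
                flip = rng.randrange(n_features)
                child[flip] = 1 - child[flip]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(data, m))


def make_toy_data(n=40, n_features=6, seed=1):
    """Synthetic stand-in: two informative features, the rest pure noise."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        informative = [label + rng.gauss(0, 0.2), label + rng.gauss(0, 0.2)]
        noise = [rng.uniform(0, 1) for _ in range(n_features - 2)]
        data.append((informative + noise, label))
    return data


data = make_toy_data()
best = select_features(data, n_features=6)
print(best, loo_accuracy(data, best))
```

The penalty term plays the role of the subset-size criterion: without it, the search has no incentive to drop items that neither help nor hurt accuracy.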
