1. QingNangTCM: a parameter-efficient fine-tuning large language model for traditional Chinese medicine
Xuming TONG ; Liyan LIU ; Yanhong YUAN ; Xiaozheng DING ; Huiru JIA ; Xu YANG ; Sio Kei IM ; Mini Han WANG ; Zhang XIONH ; Yapeng WANG
Digital Chinese Medicine 2026;9(1):1-12
Objective:
To develop QingNangTCM, a specialized large language model (LLM) tailored for expert-level traditional Chinese medicine (TCM) question-answering and clinical reasoning, addressing the scarcity of domain-specific corpora and specialized alignment.
Methods:
We constructed QnTCM_Dataset, a corpus of 100 000 entries, by integrating data from ShenNong_TCM_Dataset and SymMap v2.0, and synthesizing additional samples via retrieval-augmented generation (RAG) and persona-driven generation. The dataset comprehensively covers diagnostic inquiries, prescriptions, and herbal knowledge. Utilizing P-Tuning v2, we fine-tuned the GLM-4-9B-Chat backbone to develop QingNangTCM. A multi-dimensional evaluation framework, assessing accuracy, coverage, consistency, safety, professionalism, and fluency, was established using metrics such as bilingual evaluation understudy (BLEU), recall-oriented understudy for gisting evaluation (ROUGE), metric for evaluation of translation with explicit ordering (METEOR), and LLM-as-a-Judge with expert review. Qualitative analysis was conducted across four simulated clinical scenarios: symptom analysis, disease treatment, herb inquiry, and failure cases. Baseline models included GLM-4-9B-Chat, DeepSeek-V2, HuatuoGPT-II (7B), and GLM-4-9B-Chat (freeze-tuning).
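As an illustration of the n-gram overlap metrics named above, the following is a minimal sketch of sentence-level BLEU-N (clipped n-gram precision combined with a brevity penalty) against a single reference. It is not the authors' evaluation code: the function names, the whitespace tokenization, and the toy English sentences are assumptions made for illustration, and Chinese TCM text would need a character-level or word-segmentation tokenizer instead.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited
    at most as many times as it occurs in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate, reference, max_n=1):
    """Sentence-level BLEU-N with uniform weights and one reference.
    No smoothing is applied, so any zero precision yields a score of 0."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Toy example (hypothetical sentences, not taken from QnTCM_Dataset).
candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
print(f"BLEU-1 = {bleu(candidate, reference, 1):.4f}")  # BLEU-1 = 0.8333
print(f"BLEU-2 = {bleu(candidate, reference, 2):.4f}")  # BLEU-2 = 0.7071
```

Corpus-level scores such as those reported for BLEU-1/2/3/4 are usually aggregated over many question-answer pairs and often use smoothing, so values from this sketch are not directly comparable with published figures.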
Results:
QingNangTCM achieved the highest scores in BLEU-1/2/3/4 (0.425/0.298/0.137/0.064), ROUGE-1/2 (0.368/0.157), and METEOR (0.218), demonstrating a balanced and superior normalized performance profile of 0.900 across the dimensions of accuracy, coverage, and consistency. Although its ROUGE-L score (0.299) was lower than that of HuatuoGPT-II (7B) (0.351), it significantly outperformed domain-specific models in expert-validated win rates for professionalism (86%) and safety (73%). Qualitative analysis confirmed that the model strictly adheres to the “symptom-syndrome-pathogenesis-treatment” reasoning chain, though occasional misclassifications and hallucinations persisted when dealing with rare medicinal materials and uncommon syndromes.
Conclusion:
Combining domain-specific corpus construction with parameter-efficient prompt tuning enhances the reasoning behavior and domain adaptation of LLMs for TCM-related tasks. This work provides a technical framework for the digital organization and intelligent utilization of TCM knowledge, with potential value for supporting diagnostic reasoning and medical education.
2. Feasibility of using blood oxygen level-dependent MRI to diagnose chronic hepatitis B-induced early kidney injury: a preliminary study
Xiang WANG ; Huiru JIA ; Huanhuan WU ; Rui ZHANG ; Haoran SUN
Chinese Journal of Radiology 2016;50(9):677-681
Objective:
To explore the feasibility of using blood oxygen level-dependent (BOLD) MRI to detect chronic hepatitis B-induced early kidney injury.
Methods:
Seventeen patients with clinically diagnosed chronic hepatitis B and early kidney injury and 10 healthy volunteers were enrolled in this preliminary study. The 17 patients underwent dynamic nuclear renography and were then subdivided into a stage 1 kidney injury group (n=7) and a stage 2 kidney injury group (n=10). All enrolled subjects underwent BOLD MRI examination, and the T2* relaxation rates (R2*) of the renal cortex and medulla of each kidney, together with the medulla-to-cortex ratio (R2*med/cor), were measured separately. One-way analysis of variance (ANOVA) was performed across the control group and the two patient groups (stage 1 and stage 2 kidney injury) to compare the renal cortical R2* values, medullary R2* values, and R2*med/cor ratio. ROC curves were used to evaluate the performance of the renal cortical R2* values, medullary R2* values, and R2*med/cor ratio in diagnosing chronic hepatitis B-induced kidney injury.
Results:
The cortical R2* values of the control group, stage 1 kidney injury group, and stage 2 kidney injury group were (16.87±0.74)/s, (17.88±0.73)/s, and (20.29±2.87)/s, respectively; the medullary R2* values were (28.07±1.03)/s, (31.14±2.49)/s, and (32.81±3.28)/s, respectively; and the R2*med/cor ratios were 1.67±0.09, 1.75±0.16, and 1.63±0.13, respectively. The differences among the three groups were statistically significant (F values were 17.779, 19.170, and 3.439, respectively; all P<0.05). Furthermore, the renal cortical and medullary R2* values of the chronic hepatitis B patients were significantly higher than those of the control group, and the renal cortical R2* value of the stage 2 kidney injury group was also higher than that of the stage 1 kidney injury group. The areas under the ROC curve (AUC) of the renal cortical R2* values, medullary R2* values, and R2*med/cor ratio for diagnosing chronic hepatitis B-induced early kidney injury were 0.903, 0.949, and 0.526, respectively.
Conclusion:
Renal BOLD MRI is feasible and valuable for diagnosing chronic hepatitis B-induced early kidney injury, and the renal cortex is more sensitive than the medulla to the kidney injury.
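The two statistics used in this abstract, the one-way ANOVA F value and the area under the ROC curve, can be sketched in a few lines. This is a minimal illustration, not the study's analysis: the function names are hypothetical, the input numbers are invented toy values rather than patient data, and the AUC is computed through its rank interpretation (the probability that a randomly chosen patient value exceeds a randomly chosen control value, with ties counted as one half).

```python
def one_way_anova_f(groups):
    """F statistic for one-way ANOVA: between-group mean square
    divided by within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive score is higher; tied pairs contribute 0.5."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Invented toy values for illustration only (not R2* measurements).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(f"F = {one_way_anova_f(toy_groups):.4f}")  # F = 13.0000

toy_patients = [17.5, 18.2, 20.1, 19.0]
toy_controls = [16.5, 17.0, 17.8]
print(f"AUC = {roc_auc(toy_patients, toy_controls):.4f}")  # AUC = 0.9167
```

An AUC near 0.5, such as the 0.526 reported for R2*med/cor, means the measure separates the two groups little better than chance, whereas values near 1 indicate strong separation.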
