Syndrome Patterns Distribution and Risk Factors of Mixed Hemorrhoids in Traditional Chinese Medicine: A Multicenter Real-world Study Using Large Language Models and Latent Class Analysis
10.13288/j.11-2166/r.2026.07.010
- VernacularTitle:基于大语言模型与潜在类别模型的混合痔中医证型分布特征及危险因素分析——一项多中心真实世界研究
- Author:
Ruyue DENG 1; Kang DING 1; Yuxin ZHU 1; Meng LI 1; Huiting ZHU 1; Lei DU 1
Author Information
1. Nanjing Hospital of Chinese Medicine Affiliated to Nanjing University of Chinese Medicine, Nanjing, 210022
- Publication Type:Journal Article
- Keywords:
mixed hemorrhoids;
syndromes distribution;
large language model;
latent class analysis;
real-world study
- From:
Journal of Traditional Chinese Medicine
2026;67(7):755-763
- Country:China
- Language:Chinese
- Abstract:
Objective: To develop a standardized classification model for traditional Chinese medicine (TCM) syndrome patterns of mixed hemorrhoids using multicenter real-world data, and to characterize the distribution of these patterns and their core risk factors, thereby providing evidence-based support for standardizing TCM syndrome differentiation and implementing precision interventions.
Methods: A multicenter cross-sectional study enrolled 13 283 patients with mixed hemorrhoids from eight hospitals in Jiangsu Province between September 1, 2023 and December 31, 2024. The DeepSeek-R1-Distill-Qwen-7B and LLaMA-3.3 large language models (LLMs) were integrated with latent class analysis (LCA) to perform unsupervised learning and latent class modeling of TCM symptomatology. Potential risk factors were screened by univariate analysis, followed by logistic regression to identify independent risk factors for each syndrome pattern.
Results: The model's performance indicators were stable and reliable across clinical data types. In outpatient records, F1 scores were 99.7% for past medical history, 94.9% for current medical history, and 90.7% for specialist examination. In inpatient records, F1 scores were 98.2% for past medical history, 91.2% for current medical history, 90.3% for specialist examination, and 90.6% for discharge status. Latent class modeling identified four core TCM syndrome patterns: spleen deficiency and qi sinking syndrome (915 cases, 6.89%), damp-heat pouring downward syndrome (10 820 cases, 81.46%), qi stagnation and blood stasis syndrome (1252 cases, 9.43%), and wind injuring intestinal collaterals syndrome (296 cases, 2.22%), with latent class probabilities of 0.069, 0.815, 0.094, and 0.022, respectively. Logistic regression showed that gender, age, disease duration, hypertension, diabetes, hyperlipidemia, constipation, smoking history, and alcohol consumption were independent risk factors for syndrome pattern differentiation (P<0.05). In the efficacy validation, cure rates for patients with spleen deficiency and qi sinking syndrome and with qi stagnation and blood stasis syndrome were higher than those for patients with damp-heat pouring downward syndrome (adjusted P<0.05), with no statistically significant differences among the other syndrome patterns.
Conclusion: Damp-heat pouring downward syndrome is the predominant syndrome pattern in mixed hemorrhoids. Gender, age, disease duration, hypertension, diabetes, hyperlipidemia, constipation, smoking history, and alcohol consumption are independent risk factors for syndrome pattern differentiation.
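
The reported case counts can be related to the latent class probabilities with simple arithmetic. The following is a minimal Python sketch, not the authors' code: it uses only the four case counts given in the abstract, and the names `counts` and `class_shares` are hypothetical, introduced here for illustration. It assumes each class probability is approximately the share of patients assigned to that class.

```python
# Case counts per syndrome pattern, taken from the abstract.
counts = {
    "spleen deficiency and qi sinking": 915,
    "damp-heat pouring downward": 10820,
    "qi stagnation and blood stasis": 1252,
    "wind injuring intestinal collaterals": 296,
}


def class_shares(case_counts):
    """Return each class's share of all cases (hypothetical helper)."""
    total = sum(case_counts.values())
    return {name: n / total for name, n in case_counts.items()}


total_cases = sum(counts.values())
shares = class_shares(counts)

print(f"total cases: {total_cases}")
for name, share in shares.items():
    # Three decimals, the precision used for the class probabilities.
    print(f"{name}: {share:.3f}")
```

The counts sum to the enrolled sample, and the shares rounded to three decimals agree with the latent class probabilities reported in the abstract.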