1.Pathogenesis Reasoning Chain-of-thought Supervision for Large Language Models: Syndrome Manifestation Recognition and Multidimensional Evaluation in Spleen-stomach Disorders
Shu-Han YANG ; Yu-Xin HU ; Xin-Yu YU ; Yu-Ying TU ; Yi-Chang ZANG ; Pan-Fei LI
Progress in Biochemistry and Biophysics 2026;53(5):1240-1263
Objective: The essence of syndrome manifestation recognition in traditional Chinese medicine (TCM) is to infer the body’s latent pathogenesis state from clinical observational information, rather than to perform simple label matching. However, previous studies have largely modeled this task as syndrome pattern classification within a fixed label space, which does not adequately reflect the cognitive process of TCM syndrome differentiation centered on pathogenesis reasoning, and is also insufficient to capture the openness, semantic variability, and cross-disease reusability of syndrome manifestation expression. This study aimed to investigate whether introducing pathogenesis reasoning chain-of-thought (PR-CoT) supervision into large language models (LLMs) could improve the quality and cognitive consistency of syndrome manifestation recognition and support cross-disease transfer.
Methods: Syndrome manifestation recognition was formulated as a conditional generation task under the framework of clinical observational information (X)→pathogenesis structure (Z)→syndrome pattern output (Y), where Z serves as an explicit intermediate structural variable linking the clinical evidence and syndrome judgment. Within this framework, a PR-CoT-supervised dataset for syndrome manifestation recognition was constructed based on medical case records of spleen-stomach disorders. After preprocessing, information extraction, manual proofreading, and data cleaning, the dataset comprised 4,800 training cases, 400 development cases, and 400 test cases. Each sample was annotated with a structured PR-CoT consisting of three progressive levels: clinical information summarization, comprehensive pathogenesis analysis, and syndrome pattern output. Supervised fine-tuning was conducted on open-source LLMs, with an end-to-end model serving as the baseline. Qwen3-32B was used as the primary experimental model, and Qwen3-14B as the scale comparison model. A progressive multidimensional evaluation framework was further established, comprising a structural parsing level, a semantic similarity level, and an expert blind review level. At the structural parsing level, syndrome pattern expressions were decomposed into structural elements and evaluated using Precision, Recall, F1 score, and Jaccard similarity. At the semantic similarity level, independent LLMs scored the theoretical proximity between predicted and reference syndrome patterns. At the expert blind review level, three TCM experts independently evaluated model outputs on two dimensions: syndrome differentiation consistency and terminology standardization of syndrome patterns. In addition, zero-shot cross-disease transfer evaluation was conducted on gynecological and heart-system disorder test sets.
Results: At the structural parsing level, PR-CoT supervision did not lead to a stable improvement in the element-wise overlap of syndrome pattern structural components. Compared with the corresponding baselines, neither Qwen3-32B nor Qwen3-14B showed consistent advantages in structural matching metrics after the introduction of PR-CoT supervision. In contrast, at the semantic similarity level, PR-CoT supervision produced stable positive gains across different model scales and evaluation systems. The average semantic score of Qwen3-32B increased from 6.4258 in the baseline model to 6.5850 after PR-CoT supervision, and that of Qwen3-14B increased from 5.8700 to 5.9642. At the expert blind review level, the overall score of Qwen3-32B (PR-CoT) was 7.0260±0.1077, higher than 6.4163±0.2889 for its baseline. In zero-shot cross-disease testing, the PR-CoT model still showed advantages in semantic evaluation and expert evaluation on both gynecological and heart-system disorder test sets, indicating a certain degree of transferability.
Conclusion: The benefits of PR-CoT supervision are mainly reflected in TCM semantic consistency and clinical plausibility, rather than in improved hard matching of structural elements. These findings support understanding syndrome manifestation recognition as a process of generating and expressing latent pathogenesis structures, rather than as a classification task within a traditional fixed label space. By introducing pathogenesis reasoning as an explicit intermediate structure into the modeling process and combining it with a progressive multidimensional evaluation framework, this study provides a methodological pathway for intelligent TCM syndrome differentiation that integrates theoretical alignment, interpretability, and multi-level evaluation.
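The structural parsing metrics named in the abstract (Precision, Recall, F1 score, Jaccard similarity) can be illustrated with a minimal sketch. It assumes each syndrome pattern has already been decomposed into a set of structural elements; the abstract does not describe the decomposition scheme, so the function name `element_overlap` and the example element labels are hypothetical.

```python
def element_overlap(predicted, reference):
    """Element-wise overlap metrics between two collections of structural elements.

    Both arguments are treated as sets; how a syndrome pattern is split into
    elements is an assumption, not something the abstract specifies.
    """
    pred, ref = set(predicted), set(reference)
    shared = len(pred & ref)  # elements present in both prediction and reference
    precision = shared / len(pred) if pred else 0.0
    recall = shared / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    union = pred | ref
    jaccard = shared / len(union) if union else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "jaccard": jaccard}


# Hypothetical element labels, for illustration only.
scores = element_overlap(
    predicted=["spleen", "qi deficiency", "dampness"],
    reference=["spleen", "qi deficiency", "stomach"],
)
```

A prediction that is semantically close to the reference but worded with different elements scores low on these metrics, which is consistent with the abstract's distinction between hard structural matching and semantic similarity.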
3.Analysis of Tongue and Face Image Features of Anemic Women and Construction of Risk-Screening Model.
Hong Yuan FU ; Yi CHUN ; Ya Han ZHANG ; Yu WANG ; Yu Lin SHI ; Tao JIANG ; Xiao Juan HU ; Li Ping TU ; Yong Zhi LI ; Jia Tuo XU
Biomedical and Environmental Sciences 2025;38(8):935-951
OBJECTIVE:
To identify the key features of facial and tongue images associated with anemia in female populations, establish anemia risk-screening models, and evaluate their performance.
METHODS:
A total of 533 female participants (anemic and healthy) were recruited from Shuguang Hospital. Facial and tongue images were collected using the TFDA-1 tongue and face diagnosis instrument. Color and texture features from various parts of facial and tongue images were extracted using Face Diagnosis Analysis System (FDAS) and Tongue Diagnosis Analysis System version 2.0 (TDAS v2.0). Least Absolute Shrinkage and Selection Operator (LASSO) regression was used for feature selection. Ten machine learning models and one deep learning model (ResNet50V2 + Conv1D) were developed and evaluated.
RESULTS:
Anemic women showed lower a-values and higher L- and b-values across all age groups. Texture feature analysis showed that women aged 30-39 with anemia had higher angular second moment (ASM) and lower entropy (ENT) values in facial images, while those aged 40-49 had lower contrast (CON), ENT, and MEAN values in tongue images but higher ASM. Anemic women exhibited age-related trends similar to healthy women, with decreasing L-values and increasing a-, b-, and ASM-values. LASSO identified 19 key features from 62. Among classifiers, the Artificial Neural Network (ANN) model achieved the best performance [area under the curve (AUC): 0.849, accuracy: 0.781]. The ResNet50V2 model achieved comparable results [AUC: 0.846, accuracy: 0.818].
CONCLUSION:
Differences in facial and tongue images suggest that color and texture features can serve as potential TCM phenotype and auxiliary diagnostic indicators for female anemia.
Humans; Female; Tongue/diagnostic imaging*; Adult; Anemia/diagnosis*; Middle Aged; Face/diagnostic imaging*; Young Adult; Machine Learning
4.Effects of continuous positive airway pressure on maternal and neonatal outcomes in pregnant women with obstructive sleep apnea syndrome
Zelin TU ; Rui BAI ; Linyan ZHANG ; Jingyu WANG ; Shenda HONG ; Jingjing YANG ; Jun WEI ; Yan WANG ; Yanan LIU ; Xiaosong DONG ; Fang HAN ; Guoli LIU
Chinese Journal of Obstetrics and Gynecology 2025;60(3):171-176
Objective: To analyze the effect of continuous positive airway pressure (CPAP) on maternal and neonatal outcomes in pregnant women with obstructive sleep apnea syndrome (OSAS), especially on the incidence of hypertensive disorders in pregnancy (HDP) in women with moderate to severe OSAS. Methods: A total of 180 pregnant women with OSAS, who were diagnosed through sleep monitoring during pregnancy because of high-risk factors for OSAS and were registered at Peking University People's Hospital from January 2021 to May 2024, were selected as the study subjects. Clinical data were collected from medical records for retrospective analysis. According to whether they received standardized CPAP treatment, the women were divided into the CPAP treatment group (42 cases) and the control group (138 cases). The CPAP treatment group included 9 pregnant women with moderate to severe OSAS, and the control group included 34. Maternal and neonatal outcomes, the incidence of HDP, placental weight after delivery, and the placental weight/neonatal birth weight ratio were compared between the two groups. Results: (1) The mean gestational age in the CPAP treatment group was higher than that in the control group [(38.7±1.0) vs (38.0±1.4) weeks], the proportion of small for gestational age (SGA) infants was lower [0 (0/42) vs 12.3% (17/138)], and the birth weight of infants was higher [(3,396±475) vs (3,082±710) g]; the differences between the two groups were statistically significant (all P<0.05). There were no significant differences between the CPAP treatment group and the control group in delivery mode, rates of postpartum hemorrhage and preterm birth, umbilical artery blood gas analysis pH<7.1, lactate≥6.0 mmol/L, base excess<-12.0 mmol/L, or the incidence of gestational diabetes mellitus and HDP (all P>0.05).
(2) Placental weight was significantly lower in the CPAP treatment group than in the control group [(554.0±70.6) vs (615.7±119.1) g], as was the placental weight/neonatal birth weight ratio (median: 0.17 vs 0.19); the differences were statistically significant (all P<0.05). (3) Among pregnant women with moderate to severe OSAS, the incidence of HDP was lower in the CPAP treatment group than in the control group [1/9 vs 61.8% (21/34)], and the difference was statistically significant (P<0.05). Conclusions: CPAP treatment could prolong gestational age in pregnant women with OSAS, reduce the incidence of SGA, increase the birth weight of infants, and reduce the incidence of HDP in pregnant women with moderate to severe OSAS, and is worth promoting in clinical practice. The improvement of neonatal outcomes by CPAP treatment is closely related to the placenta, which is worthy of further exploration.
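The HDP comparison in the moderate to severe OSAS subgroup (1/9 vs 21/34) can be checked with a small-sample test. The abstract does not state which test the authors used; the sketch below assumes a two-sided Fisher's exact test, a common choice for small counts, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table that is no more
    likely than the observed one, with row and column totals held fixed.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d

    def weight(k):
        # Unnormalised probability of k events in row 1 (exact integers).
        return comb(col1, k) * comb(n - col1, row1 - k)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    observed = weight(a)
    extreme = sum(
        weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed
    )
    return extreme / comb(n, row1)


# Counts from the abstract: HDP / no HDP in each subgroup.
p_hdp = fisher_exact_two_sided(1, 8, 21, 13)  # CPAP: 1 of 9; control: 21 of 34

# The reported percentages follow directly from the counts.
sga_control_pct = round(100 * 17 / 138, 1)
hdp_control_pct = round(100 * 21 / 34, 1)
```

Under this assumed test the p-value falls below 0.05, in line with the significance reported in the abstract, although the authors' actual method may differ.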
5.Rifaximin curative effect and mechanism on monocrotaline-induced hepatic sinusoidal obstruction syndrome in mice
Si ZHAO ; Jiangqiang XIAO ; Han ZHANG ; Jingjing TU ; Qin YIN ; Yuzheng ZHUGE
Chinese Journal of Hepatology 2025;33(2):177-185
Objective: To investigate the curative effect and possible mechanism of rifaximin treatment on monocrotaline-induced hepatic sinusoidal obstruction syndrome (HSOS) in mice. Methods: Twenty-four male C57BL/6J mice were divided into three groups and treated with solvent control, monocrotaline, and rifaximin, respectively. Histopathological changes in the liver and intestine were observed by hematoxylin-eosin staining. Liver parameters, serum liver enzymes, inflammatory factors, apoptotic factors, gut microbiota, and gut tight junction proteins were compared among the three groups. Inter-group comparisons were conducted using the t-test and one-way analysis of variance. Results: Liver histopathology was significantly improved in the rifaximin-treated group. Serum alanine aminotransferase and aspartate aminotransferase levels in this group were (559.04±89.42) U/L and (676.90±106.25) U/L, respectively, significantly lower than those in the pyrrolizidine alkaloid-induced HSOS (PA-HSOS) model group [(846.05±148.46) U/L and (953.87±58.10) U/L, P<0.05], and were accompanied by lower levels of apoptotic cells and inflammatory factors. Additionally, gut microbiota diversity was higher in the rifaximin-treated group than in the PA-HSOS group (Shannon index: 7.77±0.10 vs 7.16±0.07, P<0.05), indicating clear differences in microbiota between the groups. The abundance of Firmicutes in the rifaximin group was 39.58%±0.56%, significantly higher than that in the model group (24.25%±0.64%, P<0.05), while the abundance of Bacteroidetes was 54.7%±0.41%, significantly lower than that in the model group (70.92%±0.49%, P<0.05). Meanwhile, expression of the gut tight junction proteins ZO-1 and Occludin was higher than in the model group following rifaximin intervention, which was also confirmed at the transcription level (P<0.05).
Conclusion: Rifaximin can alleviate monocrotaline-induced hepatic sinusoidal obstruction syndrome in mice, and its mechanism may involve regulation of the gut microbiota, which in turn improves intestinal barrier function.
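The Shannon index used above to compare microbiota diversity can be sketched as follows. The abstract does not give the formula or the logarithm base; this sketch assumes the usual definition H = -Σ p_i log(p_i) with base 2, which is one common convention in microbiome analysis. The function name `shannon_index` and the example counts are hypothetical.

```python
import math


def shannon_index(counts, base=2.0):
    """Shannon diversity H = -sum(p_i * log_base(p_i)) over taxa with non-zero counts.

    The logarithm base is an assumption; the abstract does not specify it.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for count in counts:
        if count > 0:
            p = count / total  # relative abundance of this taxon
            h -= p * math.log(p, base)
    return h


# Hypothetical read counts: four equally abundant taxa vs. one dominant taxon.
even_community = shannon_index([25, 25, 25, 25])
skewed_community = shannon_index([97, 1, 1, 1])
```

A more even community yields a higher index than one dominated by a single taxon, which is the sense in which the higher value in the rifaximin group is read as greater diversity.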
6.Chinese expert consensus on integrated case management by a multidisciplinary team in CAR-T cell therapy for lymphoma.
Sanfang TU ; Ping LI ; Heng MEI ; Yang LIU ; Yongxian HU ; Peng LIU ; Dehui ZOU ; Ting NIU ; Kailin XU ; Li WANG ; Jianmin YANG ; Mingfeng ZHAO ; Xiaojun HUANG ; Jianxiang WANG ; Yu HU ; Weili ZHAO ; Depei WU ; Jun MA ; Wenbin QIAN ; Weidong HAN ; Yuhua LI ; Aibin LIANG
Chinese Medical Journal 2025;138(16):1894-1896
7.Efficacy and safety of avatrombopag in the treatment of thrombocytopenia after umbilical cord blood transplantation.
Aijie HUANG ; Guangyu SUN ; Baolin TANG ; Yongsheng HAN ; Xiang WAN ; Wen YAO ; Kaidi SONG ; Yaxin CHENG ; Weiwei WU ; Meijuan TU ; Yue WU ; Tianzhong PAN ; Xiaoyu ZHU
Chinese Medical Journal 2025;138(9):1072-1083
BACKGROUND:
Delayed platelet engraftment is a common complication after umbilical cord blood transplantation (UCBT), and there is no standard therapy. Avatrombopag (AVA) is a second-generation thrombopoietin (TPO) receptor agonist (TPO-RA) that has shown efficacy in immune thrombocytopenia (ITP). However, few reports have focused on its efficacy in patients diagnosed with thrombocytopenia after allogeneic hematopoietic stem cell transplantation (allo-HSCT).
METHODS:
We conducted a retrospective study at the First Affiliated Hospital of the University of Science and Technology of China to evaluate the efficacy of AVA as a first-line TPO-RA in 65 patients after UCBT; these patients were compared with 118 historical controls. Response rates, platelet counts, megakaryocyte counts in bone marrow, bleeding events, adverse events, and survival rates were evaluated. Platelet reconstitution was compared between medication groups. Multivariable analysis was used to explore factors independently associated with platelet engraftment.
RESULTS:
Fifty-two patients were given AVA within 30 days post-UCBT, and the treatment was continued for more than 7 days to promote platelet engraftment (AVA group); the other 13 patients were given AVA for secondary failure of platelet recovery (SFPR group). The median time to platelet engraftment was shorter in the AVA group than in the historical control group (32.5 days vs. 38.0 days, Z = 2.095, P = 0.036). Among the 52 patients in the AVA group, 46 achieved an overall response (OR) (88.5%), and the cumulative incidence of OR was 91.9%. Patients treated with AVA only had a greater 60-day cumulative incidence of platelet engraftment than patients treated with recombinant human thrombopoietin (rhTPO) only or rhTPO combined with AVA (95.2% vs. 84.5% vs. 80.6%, P < 0.001). Patients suffering from SFPR had a slightly better cumulative incidence of OR (100%, P = 0.104). Patients who initiated AVA treatment within 14 days post-UCBT had a better 60-day cumulative incidence of platelet engraftment than did those who received AVA after 14 days post-UCBT (96.6% vs. 73.9%, P = 0.003).
CONCLUSION:
Our results indicate that, compared with the historical control group, AVA could effectively promote platelet engraftment and recovery after UCBT, especially when used in the early period (≤14 days post-UCBT).
Humans; Female; Male; Thrombocytopenia/etiology*; Adult; Retrospective Studies; Cord Blood Stem Cell Transplantation/adverse effects*; Middle Aged; Adolescent; Young Adult; Thiazoles/adverse effects*; Platelet Count; Receptors, Thrombopoietin/agonists*; Child; Thiophenes
8.Five-year outcomes of metabolic surgery in Chinese subjects with type 2 diabetes.
Yuqian BAO ; Hui LIANG ; Pin ZHANG ; Cunchuan WANG ; Tao JIANG ; Nengwei ZHANG ; Jiangfan ZHU ; Haoyong YU ; Junfeng HAN ; Yinfang TU ; Shibo LIN ; Hongwei ZHANG ; Wah YANG ; Jingge YANG ; Shu CHEN ; Qing FAN ; Yingzhang MA ; Chiye MA ; Jason R WAGGONER ; Allison L TOKARSKI ; Linda LIN ; Natalie C EDWARDS ; Tengfei YANG ; Rongrong ZHANG ; Weiping JIA
Chinese Medical Journal 2025;138(4):493-495
9.Identification of novel pathogenic variants in genes related to pancreatic β cell function: A multi-center study in Chinese with young-onset diabetes.
Fan YU ; Yinfang TU ; Yanfang ZHANG ; Tianwei GU ; Haoyong YU ; Xiangyu MENG ; Si CHEN ; Fengjing LIU ; Ke HUANG ; Tianhao BA ; Siqian GONG ; Danfeng PENG ; Dandan YAN ; Xiangnan FANG ; Tongyu WANG ; Yang HUA ; Xianghui CHEN ; Hongli CHEN ; Jie XU ; Rong ZHANG ; Linong JI ; Yan BI ; Xueyao HAN ; Hong ZHANG ; Cheng HU
Chinese Medical Journal 2025;138(9):1129-1131
10.Novel CD19 Fast-CAR-T cells vs. CD19 conventional CAR-T cells for the treatment of relapsed/refractory CD19-positive B-cell acute lymphoblastic leukemia.
Xu TAN ; Jishi WANG ; Shangjun CHEN ; Li LIU ; Yuhua LI ; Sanfang TU ; Hai YI ; Jian ZHOU ; Sanbin WANG ; Ligen LIU ; Jian GE ; Yongxian HU ; Xiaoqi WANG ; Lu WANG ; Guo CHEN ; Han YAO ; Cheng ZHANG ; Xi ZHANG
Chinese Medical Journal 2025;138(19):2491-2497
BACKGROUND:
Treatment with chimeric antigen receptor-T (CAR-T) cells has shown promising effectiveness in patients with relapsed/refractory B-cell acute lymphoblastic leukemia (R/R B-ALL), although the process of preparing for this therapy usually takes a long time. We have recently created CD19 Fast-CAR-T (F-CAR-T) cells, which can be produced within a single day. The objective of this study was to evaluate and contrast the effectiveness and safety of CD19 F-CAR-T cells with those of CD19 conventional CAR-T cells in the management of R/R B-ALL.
METHODS:
A multicenter, retrospective analysis of the clinical data of 44 patients with R/R B-ALL was conducted. Overall, 23 patients received the novel CD19 F-CAR-T cells (F-CAR-T group), whereas 21 patients received CD19 conventional CAR-T cells (C-CAR-T group). We compared the rates of complete remission (CR), minimal residual disease (MRD)-negative CR, leukemia-free survival (LFS), overall survival (OS), and the incidence of cytokine release syndrome (CRS) and immune effector cell-associated neurotoxicity syndrome (ICANS) between the two groups.
RESULTS:
Compared with the C-CAR-T group, the F-CAR-T group had significantly higher CR and MRD-negative CR rates (95.7% vs. 71.4%, P = 0.036; 91.3% vs. 66.7%, P = 0.044). No significant differences were observed in the 1-year or 2-year LFS or OS rates between the two groups: the 1-year and 2-year LFS rates for the F-CAR-T group vs. the C-CAR-T group were 47.8% and 43.5% vs. 38.1% and 23.8% (P = 0.384 and P = 0.216), while the 1-year and 2-year OS rates were 65.2% and 56.5% vs. 52.4% and 47.6% (P = 0.395 and P = 0.540). Additionally, among CR patients who underwent allogeneic hematopoietic stem cell transplantation (allo-HSCT) following CAR-T-cell therapy, there were no significant differences in the 1-year or 2-year LFS or OS rates: 57.1% and 50.0% vs. 47.8% and 34.8% (P = 0.506 and P = 0.356), 64.3% and 57.1% vs. 65.2% and 56.5% (P = 0.985 and P = 0.883), respectively. The incidence of CRS was greater in the F-CAR-T group (91.3%) than in the C-CAR-T group (66.7%) (P = 0.044). The incidence of ICANS was numerically greater in the F-CAR-T group (30.4%) than in the C-CAR-T group (9.5%), although the difference was not statistically significant (P = 0.085); no treatment-related deaths occurred in either group.
CONCLUSION:
Compared with C-CAR-T-cell therapy, F-CAR-T-cell therapy has a superior remission rate but also leads to a tolerably increased incidence of CRS/ICANS. Further research is needed to explore the function of allo-HSCT as an intermediary therapy after CAR-T-cell therapy.
