1.Correlation Analysis of Huanglian Jiedu Wan on Syndrome Improvement and Clinical Biomarkers of "Excess Heat-Toxicity" Based on Machine Learning Model
Qi LI ; Keke LUO ; Baolin BIAN ; Hongyu YU ; Mengxiao WANG ; Mengyao TIAN ; Wen XIA ; Yuan MA ; Xinfang ZHANG ; Pengyue LI ; Nan SI ; Hongjie WANG ; Yanyan ZHOU
Chinese Journal of Experimental Traditional Medical Formulae 2026;32(8):162-173
Objective: To build on the clinical biomarkers identified and validated in an early phase Ⅱ clinical study and to analyze, using a machine learning model, the correlation between syndrome improvement and clinical biomarkers in the treatment of "excess heat-toxicity" syndrome with Huanglian Jiedu Wan. The value of the clinical biomarkers for predicting the main symptoms of the "excess heat-toxicity" syndrome was also assessed. Methods: A total of 229 patients meeting the inclusion criteria for "excess heat-toxicity" syndrome were randomly assigned to the Huanglian Jiedu Wan group or the placebo group. Syndrome score transition matrices were constructed for each group based on the three main symptoms of "excess heat-toxicity" syndrome: oral ulcers, sore throat, and gum swelling and pain. Data from the patients with these three symptoms were also pooled for an overall analysis. The transition matrices were displayed as heatmaps to visualize symptom change trends in the two groups. Using the clinical biomarkers related to inflammation, oxidative stress, and energy metabolism that had been identified and validated in the early-phase study, Spearman correlation analysis was employed to evaluate the associations between clinical biomarkers and syndrome improvement. Key clinical biomarkers reflecting the effect of Huanglian Jiedu Wan were screened by comparing differences between groups. An extreme gradient boosting (XGBoost) algorithm was used to develop a prediction model for main symptom classification, with classification performance evaluated through 10-fold cross-validation. Feature importance analysis was applied to identify the variables contributing most to the prediction result.
Results: The syndrome transition matrices indicated that the Huanglian Jiedu Wan group improved more than the placebo group in oral ulcers, sore throat, and overall symptoms, with significant effects in the sore throat and overall symptom analyses (P<0.01). Spearman correlation analysis identified several clinical biomarkers that were positively correlated with "excess heat-toxicity" syndrome and the improvement of its main symptoms. These were termed "heat-related biomarkers" and included succinic acid, α-ketoglutaric acid, glycine, lactic acid, adenosine monophosphate (AMP), tumor necrosis factor-α (TNF-α), interferon-γ (IFN-γ), interleukin-1β (IL-1β), interleukin-4 (IL-4), interleukin-6 (IL-6), interleukin-8 (IL-8), and interleukin-10 (IL-10), among others. Conversely, clinical biomarkers that were negatively correlated with symptom severity after administration of Huanglian Jiedu Wan were termed "heat-clearing related biomarkers" and included malic acid, fumaric acid, cis-aconitic acid, adrenocorticotropic hormone (ACTH), IL-1β, IL-4, IL-8, succinic acid, and citric acid. The XGBoost classification model using all 52 biomarkers as variables achieved an average test accuracy of 0.754 and an average F1 score of 0.777. Feature importance analysis identified glutamic acid in saliva and IL-6 as the highest-ranked variables, with importance scores of 0.081 and 0.080, respectively. After 14 key variables were selected and the parameters optimized, model performance improved to an average accuracy of 0.758 and an F1 score of 0.798. Feature importance analysis further showed that glutamic acid in saliva and IL-6 changed markedly after variable screening, confirming the good syndrome prediction ability of the model built on these key clinical biomarkers.
Conclusion: This study systematically elucidates the correlation between syndrome improvement and clinical biomarkers in the treatment of "excess heat-toxicity" syndrome with Huanglian Jiedu Wan. An XGBoost classification model based on key clinical biomarkers was established; it effectively predicts symptoms related to the "excess heat-toxicity" syndrome, such as oral ulcers and sore throat, and offers a new approach to the objective identification of traditional Chinese medicine syndromes.
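The syndrome score transition matrix described in the Methods can be sketched as follows. This is a minimal illustration only: the abstract does not give the scoring scale or the construction procedure, so the 0 to 3 score levels, the function names `transition_matrix` and `improved_fraction`, and the example scores are all assumptions, not study data.

```python
from collections import Counter


def transition_matrix(before, after, levels):
    """Count patients moving from each baseline score (row) to each
    post-treatment score (column). `levels` fixes the row/column order."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    counts = Counter(zip(before, after))
    return [[counts[(b, a)] for a in levels] for b in levels]


def improved_fraction(matrix):
    """Fraction of patients below the diagonal, i.e. whose score decreased.
    Assumes `levels` was given in ascending order of severity."""
    total = sum(sum(row) for row in matrix)
    improved = sum(matrix[i][j] for i in range(len(matrix)) for j in range(i))
    return improved / total if total else 0.0


# Hypothetical symptom scores for six patients (assumed 0-3 scale).
levels = [0, 1, 2, 3]
before = [2, 2, 3, 1, 1, 3]
after = [0, 1, 1, 1, 0, 3]

matrix = transition_matrix(before, after, levels)
for row in matrix:
    print(row)
print(improved_fraction(matrix))
```

Each row of the printed matrix corresponds to a baseline score and each column to a post-treatment score, so counts below the diagonal represent improvement; a heatmap of this matrix per group would give the kind of visual comparison the abstract mentions.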
2.Identification and Analysis of bHLH Genes Related to Color Formation of Gastrodia elata Stem
Xue JIANG ; Dandan RAN ; Xiuwen WANG ; Xiaobo ZHANG ; Xiaohong OU ; Jie PAN ; Tao ZHOU ; Zhen OUYANG ; Jiao XU
Chinese Journal of Experimental Traditional Medical Formulae 2026;32(8):202-209
Objective: Gastrodia elata has evolved ecological types with shortened rhizome internodes and diversified flower and fruit coloration in response to different altitudes. Studying the genetic mechanisms of the different ecotype germplasm is significant for guiding variety breeding in different cultivation areas. Methods: The bHLH gene family was identified from the whole-genome datasets of G. elata f. elata and G. elata f. glauca. The gene family members were then analyzed for gene structure, chromosomal localization, cis-acting elements, gene synteny, and phylogeny. Transcriptome data and quantitative real-time PCR were combined to analyze the expression patterns of bHLH genes in the stems of the different G. elata ecotype germplasm. Finally, correlation analysis between gene expression patterns and color was conducted to identify the key bHLH genes regulating stem color formation. Results: A total of 63 bHLH genes were identified in both G. elata f. elata and G. elata f. glauca, unevenly distributed across 17 chromosomes and clustered into 16 subfamilies, with significant expansion in some family members. Obvious inversions of bHLH genes on the same chromosome and interchromosomal translocations were detected in the two ecotype germplasm. Among these genes, 12 bHLH genes (such as bHLH62-3 and bHLH74) were associated with the bright yellow color of the G. elata f. elata stem, while 9 bHLH genes (such as PIL13, UNE12, and bHLH130) were correlated with the red color of the G. elata f. glauca stem. Compared with G. elata f. glauca, the bHLH48 expression level was significantly higher in flowers and scale leaves of G. elata f. elata, and the bHLH62-3 expression level was significantly higher in all organs of G. elata f. elata. Conclusions: Functional pathway divergence of the bHLH family members has occurred across different chromosomes in G. elata f. elata and G. elata f. glauca. Through synergism or antagonism with other genes, 21 bHLH genes participate in regulating the coloration metabolic pathways of stems, flowers, and fruits. Specifically, bHLH62-3 is involved in regulating stem color differentiation in the anthocyanin biosynthesis pathway of G. elata and is thus relevant to stem color formation. Additionally, GebHLH48 positively regulates flowering-related pathways to promote the early-flowering phenotype of G. elata f. elata. These findings lay the foundation for analyzing the genetic regulatory mechanisms underlying stem color formation in G. elata.
5.Transcriptome-based Mining of Genes Involved in Regulation of Cyclopeptide B Synthesis in Pseudostellaria heterophylla
Qingsu ZHOU ; Yishu HUANG ; Xiuwen WANG ; Jiao XU ; Xiaohong OU ; Hua HE ; Weike JIANG ; Tao ZHOU
Chinese Journal of Experimental Traditional Medical Formulae 2026;32(9):224-230
Objective: The biosynthesis of heterophyllin B (HB), a cyclopeptide from Pseudostellaria heterophylla, is regulated by various abiotic stresses. Elucidating the transcriptional regulatory mechanism underlying HB biosynthesis is of great significance for the directional improvement of P. heterophylla varieties and the enhancement of HB content. Methods: Based on transcriptome data from different tissues of P. heterophylla, transcription factors (TFs) specifically upregulated and highly expressed in the phloem of tuberous roots were screened through a combination of Mfuzz time-series clustering, transcription factor family prediction, and correlation analysis. Quantitative real-time polymerase chain reaction (Real-time PCR) was employed to analyze the expression patterns of candidate TFs under abscisic acid (ABA) induction, and the dual-luciferase reporter assay was applied to verify their regulatory effects on HB precursor genes. Results: Content determination showed that HB accumulated at the highest level in the phloem of P. heterophylla tuberous roots (34 μg
6.LIU Shangyi's Experience in Differentiating and Treating Rectal Carcinoma Under the Theory of "Treating Ulcers as Tumors"
Wenqi HUANG ; Bing YANG ; Zhenming XIE ; Jinghui WANG ; Dingxue WANG ; Wenyu WU ; Dongxin TANG
Journal of Traditional Chinese Medicine 2026;67(7):716-719
This paper summarizes the experience of Professor LIU Shangyi in differentiating and treating rectal carcinoma from the perspective of "treating ulcers as tumors". It is believed that the manifestations of rectal cancer, such as anal itching, cauliflower-like or ulcerative tumors, and bloody stools, are similar to external skin itching, skin ulceration, swelling, and skin bleeding. Therefore, the treatment principles for sores and ulcers can be applied to treat tumors. Following the diagnostic and treatment approach of dermatology to typical clinical symptoms, anal itching is treated mainly by dispelling wind and removing dampness, and clearing heat to relieve itching, using "skin medicinals" such as Difuzi (Fructus Kochiae) and Baixianpi (Cortex Dictamni), as well as wind medicinals such as Shengma (Rhizoma Cimicifugae) and Fangfeng (Radix Saposhnikoviae). For constipation, the method of clearing heat and resolving toxins, unblocking the bowels and discharging heat can be used, commonly with Baitouweng (Radix Pulsatillae), Donglingcao (Herba Rabdosiae Rubescentis) and Dahuang (Radix et Rhizoma Rhei). For mucosal ulcers, it is critical to differentiate between yin and yang: the treatment of yang ulcers should focus on clearing heat and resolving toxins, commonly with modified Xianfang Huoming Beverage (仙方活命饮), whereas for yin ulcers emphasis should be placed on removing dampness and resolving phlegm, commonly with modified Yiyi Fuzi Baijiang Powder (薏苡附子败酱散). For bloody stool, deficiency is differentiated from excess: Diyu (Radix Sanguisorbae) and Huaihua (Flos Sophorae) are used for excess syndrome to cool blood and stop bleeding, and both herbs dry-fried until charred are combined with liver-tonifying medicinals for deficiency syndrome
7.Protective effect and mechanism of chikusetsu saponin Ⅳa on the kidney in diabetic nephropathy rats
Yongli WANG ; Hai CHEN ; Xiaofang TIAN ; Xuechun WANG ; Liying YUAN ; Dan LIU ; Zhongfa LI ; Yanfang MENG ; Xiuyong YANG
China Pharmacy 2026;37(7):908-913
OBJECTIVE To study the protective effect and potential mechanism of chikusetsu saponin Ⅳa (chsⅣ) on renal function in diabetic nephropathy (DN) model rats. METHODS A DN rat model was established by high-fat diet combined with streptozotocin injection. Thirty-six model rats were randomly divided into a model group (i.g. administration of normal saline, high-fat diet) and chsⅣ low-dose and high-dose groups (i.g. administration of 90, 180 mg/kg chsⅣ, high-fat diet), with 12 rats in each group. Additionally, 10 normal rats were set as the control group (i.g. administration of normal saline, regular diet). From the 5th to the 12th week after streptozotocin injection, the rats were given the relevant drug or normal saline intragastrically, once a day. After the last administration, the levels of fasting blood glucose, fasting insulin, blood urea nitrogen, serum creatinine and urine protein, as well as the levels of reduced glutathione (GSH), superoxide dismutase (SOD) and malondialdehyde (MDA) in renal tissues, were measured, and the insulin resistance index was calculated. Hematoxylin-eosin, periodic acid-Schiff, and Masson staining were employed to examine the histopathological alterations in the renal tissue. The expressions of Notch signaling pathway-related proteins in renal tissue were detected by immunohistochemical staining and Western blot.
RESULTS Compared with the model group, the histomorphology of renal tissues in the chsⅣ low- and high-dose groups was significantly improved, with significant decreases in renal histological scores, mesangial expansion index, and glomerulosclerosis scores (P<0.05); the levels of fasting blood glucose, fasting insulin, blood urea nitrogen, serum creatinine, urine protein and homeostasis model assessment for insulin resistance, as well as MDA content and the expression levels of Notch1, Notch intracellular domain, hairy and enhancer of split 1 and Delta-like protein 1 in renal tissue, were all significantly decreased (P<0.05). The levels of GSH and SOD in renal tissue were significantly elevated (P<0.05). Moreover, the improvement in these indicators was significantly more pronounced in the chsⅣ high-dose group than in the chsⅣ low-dose group (P<0.05). CONCLUSIONS ChsⅣ can ameliorate renal pathological damage and functional impairment in DN rats. Its underlying mechanisms include restoration of glucose homeostasis and insulin sensitivity, attenuation of renal oxidative stress, and suppression of aberrant Notch signaling pathway activation.
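The abstract reports a homeostasis model assessment for insulin resistance (HOMA-IR) but does not state how it was computed. A minimal sketch, assuming the conventional formula (fasting glucose in mmol/L multiplied by fasting insulin in μU/mL, divided by 22.5), is shown below; the function name `homa_ir` and the input values are hypothetical and are not data from the study.

```python
def homa_ir(fasting_glucose_mmol_l, fasting_insulin_uu_ml):
    """Conventional HOMA-IR: glucose (mmol/L) * insulin (uU/mL) / 22.5.
    Assumed here; the abstract does not specify the formula used."""
    if fasting_glucose_mmol_l < 0 or fasting_insulin_uu_ml < 0:
        raise ValueError("concentrations must be non-negative")
    return fasting_glucose_mmol_l * fasting_insulin_uu_ml / 22.5


# Hypothetical values: a higher glucose and insulin pair gives a higher index.
baseline = homa_ir(5.0, 9.0)
elevated = homa_ir(15.0, 18.0)
print(baseline, elevated)
```

Because the index is a product of the two fasting measurements, a treatment that lowers both fasting glucose and fasting insulin, as reported for chsⅣ, lowers HOMA-IR more than either change alone would.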
8.Analysis of comparator selection strategies for pharmaceutical enterprises in the national reimbursement drug list access application
Qingwen WANG ; Qin AN ; Xiaoyan YUAN ; Yuzhi HAN ; Xi CHEN ; Hongyan WU
China Pharmacy 2026;37(8):985-990
OBJECTIVE To analyze the selection and rationales of comparators for pharmaceutical enterprises in their medical insurance access application, so as to provide a reference for promoting communication and consensus between enterprises and medical insurance authorities in this process. METHODS The application materials for drugs outside the catalogue that passed formal review published by the National Healthcare Security Administration from 2021 to 2025 were extracted, and then content analysis was used to systematically sort out relevant information of the declared drugs and comparators; the specific situations and rationales of pharmaceutical enterprises’ selection of comparators were analyzed. RESULTS A total of 1 341 declared drug documents were collected. Data analysis showed that 1 035 (77.18%) were submitted with positive comparators and 306 (22.82%) used blank comparators; 58 drugs (4.33%) took combination therapy as the reference, and 5 drugs (0.37%) referred to non-pharmacological (or non-single pharmacological) treatment regimens. Among competitive drugs declared by multiple enterprises, 50.00% of the enterprises submitted different comparators. A total of 4 basic conditions and 39 additional conditions were extracted as the rationales for selecting positive comparators. For blank comparators, 12 drug-related factors, 2 administrative factors, and 1 other factor were identified. More than 10% of the drugs did not state the rationale for comparator selection, and over 44% of drugs using blank comparators provided only one justification. CONCLUSIONS Pharmaceutical enterprises mainly select comparators based on their own interests in the medical insurance access application, and there are deficiencies in the adequacy and standardization of their selection basis and reasoning. 
It is recommended that enterprises follow the principled requirements of medical insurance authorities, and fully and normatively explain the reasons for selecting comparators in combination with the characteristics of their own products. Meanwhile, it is advisable to change the current open-ended statement form of selection reasons into a closed-ended answering mode, so as to highlight the priority of selection, standardize the declaration behavior of enterprises, and reduce communication divergences between the two parties.
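The proportions reported in the Results can be reproduced from the stated counts. The short check below assumes that every percentage uses the 1 341 collected documents as its denominator, which the abstract implies but does not state outright; the helper name `share` is illustrative.

```python
def share(count, total):
    """Percentage of `total` represented by `count`, rounded to two decimals."""
    return round(100 * count / total, 2)


TOTAL_DOCUMENTS = 1341  # declared drug documents collected

shares = {
    "positive comparators": share(1035, TOTAL_DOCUMENTS),
    "blank comparators": share(306, TOTAL_DOCUMENTS),
    "combination therapy as reference": share(58, TOTAL_DOCUMENTS),
    "non-pharmacological regimens": share(5, TOTAL_DOCUMENTS),
}
for label, value in shares.items():
    print(f"{label}: {value}%")
```

The positive and blank comparator counts sum to the full 1 341 documents, so those two categories partition the sample, while the combination-therapy and non-pharmacological figures describe subsets within it.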
9.Assessing High-density Y-SNP Panels for Paternal Haplogroup Assignment in Forensic Practice
De-Qin ZHANG ; Chun-Nian WANG ; Lin-Lin LOU ; Meng NI ; Jing GAO ; Jiang HUANG ; Li JIANG
Progress in Biochemistry and Biophysics 2026;53(2):458-469
Objective: The accuracy of Y-chromosome haplogroup assignment is crucial for tracing paternal lineage in male samples. With the advancement of high-throughput sequencing technologies, high-density Y-SNP genotyping from whole-genome or array-based data has become a standard method for determining Y-chromosome haplogroups. This study systematically evaluated the performance of four commonly used high-density SNP genotyping systems for haplogroup assignment: the Global Screening Array (GSA), Chinese Genotyping Array (CGA), Affymetrix array, and the 1240K capture panel. This work provides a reference for data comparison across different systems. Methods: We extracted genotype data for the four Y-SNP panels from 30× whole-genome sequencing (WGS) data of 1 590 male samples from the 1000 Genomes Project. Additionally, GSA array genotype data from 384 relative pairs (spanning 1st- to 12th-degree relationships) from 109 Chinese Han families were collected. Haplogroup assignment was performed using Y-LineageTracker v1.3.0 software. We assessed the concordance and resolution of haplogroup assignments between the four Y-SNP panels and the WGS data. The consistency and resolution of haplogroup assignments were also evaluated for both the 1000 Genomes Project samples and the 109 family samples collected in this study. Furthermore, the impact of varying numbers of Y-SNPs on haplogroup assignment was examined. Results: The GSA and CGA panels demonstrated superior resolution and discrimination of haplogroup subclades compared with the other two panels. The haplogroup assignments from the GSA, CGA, and 1240K panels showed high concordance with WGS data, with consistency rates exceeding 88.70%, whereas the Affymetrix platform exhibited a significantly lower consistency rate of 61.89%.
Specifically, the GSA and CGA panels consistently demonstrated superior performance compared with the other two panels in the assignment of haplogroups O-M175 and H-L901, achieving complete concordance (100%) for both haplogroups. In contrast, the Affymetrix panel erroneously assigned all individuals belonging to haplogroup O-M175 to haplogroup K2-M526. Furthermore, its accuracy for haplogroup H-L901 was exceedingly low, at merely 1.41%. This poor performance was characterized by the misassignment of 98.59% of H-L901 samples—specifically, 1.41% to J-M304 and a predominant 97.18% to F-M89. For haplogroup R-M207, all four panels exhibited uniformly high levels of consistency, with concordance values exceeding 94.00%. Notably, for haplogroup E-M96, the 1240K and Affymetrix panels outperformed the GSA and CGA panels in terms of concordance, representing the first instance in which these two panels surpassed the latter. Conversely, for haplogroups J-M304, Q-M242, and I-M170, all 4 panels showed relatively elevated misclassification rates, with the Affymetrix array demonstrating the poorest overall performance. None of the four panels showed any discordant haplogroup assignments among the familial relative pairs analyzed. A positive correlation was observed between the number of Y-SNPs (ranging from 1 000 to 10 000) and classification consistency; however, classification consistency plateaued when the number of Y-SNPs exceeded 10 000. Furthermore, a random sampling analysis conducted on the GSA and CGA panels demonstrated that the haplogroup misclassification rate exhibited negligible fluctuation across the Y-SNP range of 500 to 1 000. Conversely, a marked enhancement in classification consistency was observed as the number of markers increased from 1 000 to 5 000, ultimately reaching a plateau within the interval of 5 000 to 8 000 markers. 
Conclusion: These findings indicate that the GSA and CGA panels provide high resolution and concordance, delivering reliable Y-haplogroup assignment for forensic investigations.
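The concordance comparison between a panel and the WGS reference can be sketched as a per-sample agreement rate. This is an illustrative sketch only: the abstract does not describe how concordance was scored (for example, at which level of the haplogroup tree), so exact label matching is an assumption, and the function names and the four sample calls are hypothetical rather than study data.

```python
from collections import defaultdict


def concordance_rate(panel_calls, reference_calls):
    """Share of samples present in both call sets whose haplogroup labels match."""
    shared = panel_calls.keys() & reference_calls.keys()
    if not shared:
        raise ValueError("no samples in common")
    matches = sum(1 for s in shared if panel_calls[s] == reference_calls[s])
    return matches / len(shared)


def per_haplogroup_concordance(panel_calls, reference_calls):
    """Concordance rate grouped by the reference haplogroup of each sample."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for sample in panel_calls.keys() & reference_calls.keys():
        ref = reference_calls[sample]
        totals[ref] += 1
        hits[ref] += panel_calls[sample] == ref
    return {ref: hits[ref] / totals[ref] for ref in totals}


# Hypothetical calls; sample IDs and assignments are made up for illustration.
reference = {"s1": "O-M175", "s2": "O-M175", "s3": "R-M207", "s4": "H-L901"}
panel = {"s1": "K2-M526", "s2": "K2-M526", "s3": "R-M207", "s4": "F-M89"}

overall = concordance_rate(panel, reference)
by_group = per_haplogroup_concordance(panel, reference)
print(overall, by_group)
```

Grouping by reference haplogroup is what exposes a pattern like the one reported for the Affymetrix panel, where an acceptable overall rate can hide a haplogroup that is misassigned in every sample.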
