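- VernacularTitle:高密度Y-SNP分型体系用于法医实践中Y单倍群推断的效能评估 (Performance evaluation of high-density Y-SNP genotyping systems for Y-haplogroup inference in forensic practice)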
- Author: De-Qin ZHANG (1); Chun-Nian WANG (2); Lin-Lin LOU (2); Meng NI (3); Jing GAO (2); Jiang HUANG (1); Li JIANG (2)
- Publication Type:Journal Article
- Keywords: Y-SNP; NRY haplogroup classification; paternal genealogy; ancestry inference
- From: Progress in Biochemistry and Biophysics 2026;53(2):458-469
- Country:China
- Language:Chinese
- Abstract: Objective: Accurate Y-chromosome haplogroup assignment is crucial for tracing paternal lineage in male samples. With the advancement of high-throughput sequencing technologies, high-density Y-SNP genotyping from whole-genome or array-based data has become a standard method for determining Y-chromosome haplogroups. This study systematically evaluated the performance of four commonly used high-density SNP genotyping systems for haplogroup assignment: the Global Screening Array (GSA), the Chinese Genotyping Array (CGA), the Affymetrix array, and the 1240K capture panel. This work provides a reference for data comparison across different systems. Methods: We extracted genotype data for the four Y-SNP panels from 30× whole-genome sequencing (WGS) data of 1 590 male samples from the 1000 Genomes Project. Additionally, GSA array genotype data were collected from 384 relative pairs (spanning 1st- to 12th-degree relationships) from 109 Chinese Han families. Haplogroup assignment was performed using Y-LineageTracker v1.3.0. We assessed the concordance and resolution of haplogroup assignments between the four Y-SNP panels and the WGS data, in both the 1000 Genomes Project samples and the 109 families collected in this study. Furthermore, the impact of varying numbers of Y-SNPs on haplogroup assignment was examined. Results: The GSA and CGA panels demonstrated superior resolution and discrimination of haplogroup subclades compared with the other two panels. The haplogroup assignments from the GSA, CGA, and 1240K panels showed high concordance with WGS data, with consistency rates exceeding 88.70%, whereas the Affymetrix platform exhibited a significantly lower consistency rate of 61.89%.
Specifically, the GSA and CGA panels outperformed the other two panels in the assignment of haplogroups O-M175 and H-L901, achieving complete concordance (100%) for both haplogroups. In contrast, the Affymetrix panel erroneously assigned all individuals belonging to haplogroup O-M175 to haplogroup K2-M526. Its accuracy for haplogroup H-L901 was also exceedingly low, at merely 1.41%: the remaining 98.59% of H-L901 samples were misassigned, 1.41% to J-M304 and 97.18% to F-M89. For haplogroup R-M207, all four panels exhibited uniformly high consistency, with concordance values exceeding 94.00%. Notably, for haplogroup E-M96, the 1240K and Affymetrix panels outperformed the GSA and CGA panels in concordance, the first instance in which these two panels surpassed the latter two. Conversely, for haplogroups J-M304, Q-M242, and I-M170, all four panels showed relatively elevated misclassification rates, with the Affymetrix array demonstrating the poorest overall performance. None of the four panels showed any discordant haplogroup assignments among the familial relative pairs analyzed. A positive correlation was observed between the number of Y-SNPs (ranging from 1 000 to 10 000) and classification consistency; however, classification consistency plateaued when the number of Y-SNPs exceeded 10 000. Furthermore, a random sampling analysis of the GSA and CGA panels showed that the haplogroup misclassification rate fluctuated negligibly between 500 and 1 000 Y-SNPs. Classification consistency then improved markedly as the number of markers increased from 1 000 to 5 000, and reached a plateau between 5 000 and 8 000 markers.
Conclusion: These findings indicate that the GSA and CGA panels provide high resolution and concordance, delivering reliable Y-haplogroup assignment for forensic investigations.
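
The concordance figures reported above can be pictured as a comparison of two label sets per sample: the haplogroup called from WGS data and the haplogroup called from one panel. The following is a minimal Python sketch of that bookkeeping, not the authors' pipeline. It assumes each call has already been reduced to a major haplogroup label (for example `O-M175`) and that "concordant" means an exact label match; the abstract does not state the exact definition used. The function name `concordance_report` and the four sample records are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def concordance_report(reference, panel):
    """Compare panel haplogroup labels against reference (e.g. WGS) labels.

    Both arguments map sample ID -> major haplogroup label.
    Only samples present in both mappings are compared.
    Assumption: concordance is an exact match of the two labels.
    """
    shared = sorted(set(reference) & set(panel))
    if not shared:
        raise ValueError("no shared samples to compare")

    # For each reference haplogroup, count the labels the panel produced.
    assigned = defaultdict(Counter)
    for sample in shared:
        assigned[reference[sample]][panel[sample]] += 1

    matches = sum(counts[hg] for hg, counts in assigned.items())
    per_haplogroup = {}
    for hg, counts in assigned.items():
        total = sum(counts.values())
        per_haplogroup[hg] = {
            "n": total,
            "concordance": counts[hg] / total,
            "misassigned_to": {k: v for k, v in counts.items() if k != hg},
        }
    return {"overall": matches / len(shared), "per_haplogroup": per_haplogroup}


# Made-up toy labels, not data from the study.
reference = {"s1": "O-M175", "s2": "O-M175", "s3": "R-M207", "s4": "H-L901"}
panel = {"s1": "K2-M526", "s2": "K2-M526", "s3": "R-M207", "s4": "F-M89"}

report = concordance_report(reference, panel)
print(report["overall"])
print(report["per_haplogroup"]["O-M175"])
```

Keeping the per-haplogroup breakdown alongside the overall rate makes it visible when a single clade is systematically reassigned to an upstream label, which an overall percentage alone would hide.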

