1. A modified T-test feature selection method and its application on the HapMap genotype data.
Genomics, Proteomics & Bioinformatics 2007;5(3-4):242-249
Single nucleotide polymorphisms (SNPs) are genetic variations that determine the differences between any two unrelated individuals. Various population groups can be distinguished from each other using SNPs. For instance, the HapMap dataset has four population groups with about ten million SNPs. For more insights on human evolution, ethnic variation, and population assignment, we propose to find out which SNPs are significant in determining the population groups and then to classify different populations using these relevant SNPs as input features. In this study, we developed a modified t-test ranking measure and applied it to the HapMap genotype data. Firstly, we rank all SNPs in comparison with other feature importance measures including F-statistics and the informativeness for assignment. Secondly, we select different numbers of the most highly ranked SNPs as the input to a classifier, such as the support vector machine, so as to find the best feature subset corresponding to the best classification accuracy. Experimental results showed that the proposed method is very effective in finding SNPs that are significant in determining the population groups, with reduced computational burden and better classification accuracy.
Algorithms ; Computational Biology ; Databases, Nucleic Acid ; Genetics, Medical ; statistics & numerical data ; Genetics, Population ; Genomics ; statistics & numerical data ; Genotype ; Humans ; Polymorphism, Single Nucleotide
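The ranking-then-selection workflow described in the abstract of entry 1 can be illustrated with a minimal sketch. It uses a plain two-group Welch t statistic, not the paper's modified t-test (whose definition is not given in the abstract), and it stops at feature selection without training a classifier. The function names `welch_t` and `rank_snps`, the 0/1/2 genotype coding, and the toy data are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples (unequal variances allowed)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        # No within-group spread: equal means score 0, otherwise maximal separation.
        return 0.0 if mean(a) == mean(b) else float("inf")
    return (mean(a) - mean(b)) / se


def rank_snps(genotypes, labels, top_k):
    """Return indices of the top_k SNP columns ranked by |t| between two groups.

    genotypes: one row per individual, each a list of 0/1/2 allele counts
               (an assumed coding, not taken from the paper).
    labels:    group label per row; exactly two distinct values are supported.
    """
    groups = sorted(set(labels))
    if len(groups) != 2:
        raise ValueError("this sketch handles exactly two groups")
    n_snps = len(genotypes[0])
    scores = []
    for j in range(n_snps):
        a = [row[j] for row, g in zip(genotypes, labels) if g == groups[0]]
        b = [row[j] for row, g in zip(genotypes, labels) if g == groups[1]]
        scores.append(abs(welch_t(a, b)))
    # Highest |t| first; ties keep their original column order.
    order = sorted(range(n_snps), key=lambda j: scores[j], reverse=True)
    return order[:top_k]


# Toy data, made up for illustration: 6 individuals, 3 SNPs, 2 groups.
toy_genotypes = [
    [0, 1, 2],
    [0, 2, 1],
    [1, 1, 2],
    [2, 1, 1],
    [2, 2, 2],
    [2, 1, 2],
]
toy_labels = ["A", "A", "A", "B", "B", "B"]
top = rank_snps(toy_genotypes, toy_labels, top_k=1)
```

In this toy set only the first SNP differs between the groups, so it is the one selected. In practice the selected columns would then be passed to a classifier such as a support vector machine, with `top_k` varied to find the best-performing subset.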
2. Develop a statistics analysis software in population genetics using VBA language.
Ying CAI ; Ni ZHOU ; Ye-li XU ; Da-peng XIANG ; Jiang-hui SU ; Lin-tian ZHANG
Journal of Forensic Medicine 2006;22(6):417-420
OBJECTIVE:
To develop statistical analysis software for STR population genetics, in order to promote and accelerate basic research in the field.
METHODS:
Microsoft VBA for Excel, which is simple and easy to use, was selected as the programming language, and its macro function was used to develop the statistical analysis software for STR population genetics.
RESULTS:
The VBA-based software "Easy STR Genetics", which performs population genetic analysis of STR data, was developed.
CONCLUSION:
"Easy STR Genetics" can be disseminated in STR population genetics research domestically and internationally, owing to its comprehensive functionality, good compatibility with different input data formats, and clear, easy-to-understand output of statistical and calculation results.
Algorithms ; Electronic Data Processing ; Genetics, Population/statistics & numerical data* ; Humans ; Microsatellite Repeats ; Quality Control ; Software ; Software Design
3. Analysis of HPA1-16 and HLA-A, B gene polymorphisms among ethnic Han population from Shandong.
Yi ZHANG ; Yuan YU ; Wenben QIAO ; Yan LIU ; Juan ZHOU ; Jianhong XU ; Bing FAN ; Liyue JIANG ; Wenhua LIANG ; Chuanfu ZHU
Chinese Journal of Medical Genetics 2016;33(5):690-693
OBJECTIVE: To study the polymorphisms of human platelet antigen (HPA) 1-16 and human leukocyte antigen (HLA)-A and -B loci among the ethnic Han population from Shandong.
METHODS: A total of 588 samples from platelet donors were genotyped for the above loci with sequence-specific primer PCR and sequence-specific oligonucleotide probe PCR.
RESULTS: The frequencies of HPA-1a, -1b, HPA-2a, -2b, HPA-3a, -3b, HPA-4a, -4b, HPA-5a, -5b, HPA-6a, -6b, HPA-15a, -15b were 0.9974, 0.0026, 0.9456, 0.0544, 0.5417, 0.4583, 0.9983, 0.0017, 0.9889, 0.0111, 0.9903, 0.0097, 0.5434 and 0.4583, respectively. HPA-7 to -14 and HPA-16 showed no heterozygosity, as the b allele was not detected at these loci. The most common genotypic combination for HPA was HPA-(1,4,7-14,16,17)aa-2aa-3ab-5aa-6aa-15ab (0.1820). HLA-A2 (0.3070) and HLA-B13 (0.1361) demonstrated the highest frequencies at their respective loci.
CONCLUSION: The HPA and HLA loci are highly polymorphic among ethnic Hans from Shandong. The distribution of HPA polymorphisms also shows marked ethnic and regional differences. It is important to construct a regional database of HPA and HLA genotypes for platelet donors.
Alleles ; Antigens, Human Platelet ; genetics ; Asian Continental Ancestry Group ; genetics ; statistics & numerical data ; Blood Donors ; China ; Female ; Gene Frequency ; Genetics, Population ; Genotype ; HLA-A Antigens ; genetics ; HLA-B Antigens ; genetics ; Humans ; Linkage Disequilibrium ; Male ; Polymorphism, Genetic
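As a worked illustration of how the allele frequencies reported in entry 3 relate to heterozygosity, the sketch below computes expected heterozygosity as 1 minus the sum of squared allele frequencies. This assumes Hardy-Weinberg proportions, which the abstract does not state, and the function name `expected_heterozygosity` is hypothetical. Only frequencies quoted in the abstract are used as input.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity 1 - sum(p_i^2), assuming Hardy-Weinberg proportions."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 0.01:
        raise ValueError(f"allele frequencies sum to {total:.4f}, expected about 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Frequencies as reported in the abstract (a allele, b allele).
reported = {
    "HPA-1": (0.9974, 0.0026),
    "HPA-2": (0.9456, 0.0544),
    "HPA-3": (0.5417, 0.4583),
}
het = {locus: expected_heterozygosity(f) for locus, f in reported.items()}

# A locus where the b allele was not detected has frequencies (1, 0).
monomorphic = expected_heterozygosity((1.0, 0.0))
```

Under this assumption HPA-3 comes out near 0.4965, close to the biallelic maximum of 0.5, while a locus with no detected b allele gives exactly 0, consistent with the abstract's remark that such loci showed no heterozygosity.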
4. Polymorphisms in CYP2R1 Gene Associated with Serum Vitamin D Levels and Status in a Chinese Rural Population.
Yan WANG ; Han HAN ; Jun WANG ; Fang SHEN ; Fei YU ; Ling WANG ; Song Cheng YU ; Dong Dong ZHANG ; Hua Lei SUN ; Yuan XUE ; Yue BA ; Chong Jian WANG ; Wen Jie LI
Biomedical and Environmental Sciences 2019;32(7):550-553
6. HPV infection among Uygur women in a rural area of Hetian Prefecture, Xinjiang Uygur Autonomous Region, China.
Sulaiya HUSAIYIN ; Mayinuer NIYAZI ; Li hong WANG ; Jun Jie WANG ; Jian Bing WANG ; Ayeti SIMAYI ; Lin WANG ; Zumurelaiti AINIWAER ; Chun Hua MA ; Jennifer S SMITH
Biomedical and Environmental Sciences 2013;26(11):934-936
Adult ; Age Factors ; China ; epidemiology ; Female ; Human Papillomavirus DNA Tests ; Humans ; Middle Aged ; Papillomaviridae ; genetics ; isolation & purification ; Papillomavirus Infections ; epidemiology ; pathology ; virology ; Prevalence ; Rural Population ; statistics & numerical data ; Uterine Cervical Neoplasms ; epidemiology ; pathology ; virology ; Young Adult