1. Novel strategies to identify relevant molecular signatures for complex human diseases based on data of identical-by-descent profiles and genomic context
Chuanxing LI ; Lei DU ; Xia LI ; Binsheng GONG ; Jie ZHANG ; Shaoqi RAO
Journal of Peking University(Health Sciences) 2006;38(1):74-77
Objective: To develop novel strategies to identify relevant molecular signatures for complex human diseases based on identical-by-descent (IBD) profiles and genomic context. Methods: In the proposed strategies, we define four relevancy criteria for mapping SNP-phenotype relationships: point-wise IBD mean difference, averaged IBD difference over a window, Z curve, and averaged slope over a window. Results: These criteria, combined with a permutation test, were applied to 100 simulated replicates for two hypothetical American populations to extract the SNPs relevant to alcoholism from sib-pair IBD profiles of pedigrees. The proposed strategies identified most of the simulated true loci. Conclusion: This data mining exercise suggests that the IBD statistic and genomic context can serve as information for locating the genes underlying complex human diseases. Compared with the classical Haseman-Elston sib-pair regression method, the proposed strategies are more efficient for large-scale genomic mining.
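The abstract above names its criteria without defining them, so the following is only an illustrative sketch of how a point-wise IBD mean difference, a windowed average of it, and a permutation test might be computed. It assumes (this is not stated in the abstract) that the difference is taken between concordant and discordant sib pairs and that each pair has one IBD sharing proportion per marker. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean


def pointwise_ibd_mean_difference(ibd, labels):
    """Per-marker difference in mean IBD sharing between two groups of sib pairs.

    ibd    : list of rows, one per sib pair; each row holds one IBD sharing
             proportion per marker (assumed layout).
    labels : 1 for a concordant pair, 0 for a discordant pair (assumed coding).
    """
    n_markers = len(ibd[0])
    diffs = []
    for m in range(n_markers):
        conc = [row[m] for row, lab in zip(ibd, labels) if lab == 1]
        disc = [row[m] for row, lab in zip(ibd, labels) if lab == 0]
        diffs.append(mean(conc) - mean(disc))
    return diffs


def windowed_average(values, half_width):
    """Average each value with its neighbours within half_width markers."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        out.append(mean(values[lo:hi]))
    return out


def permutation_p_values(ibd, labels, n_perm=1000, seed=0):
    """One-sided per-marker p-values obtained by shuffling the pair labels."""
    rng = random.Random(seed)
    observed = pointwise_ibd_mean_difference(ibd, labels)
    exceed = [0] * len(observed)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        permuted = pointwise_ibd_mean_difference(ibd, shuffled)
        for m, (p, o) in enumerate(zip(permuted, observed)):
            if p >= o:
                exceed[m] += 1
    # Add-one correction so that no p-value is exactly zero.
    return [(c + 1) / (n_perm + 1) for c in exceed]


# Toy data: marker 0 separates the two groups, marker 1 does not.
toy_ibd = [[1.0, 0.5], [1.0, 0.5], [0.0, 0.5], [0.0, 0.5]]
toy_labels = [1, 1, 0, 0]
toy_diffs = pointwise_ibd_mean_difference(toy_ibd, toy_labels)
toy_smoothed = windowed_average(toy_diffs, half_width=1)
toy_p = permutation_p_values(toy_ibd, toy_labels)
```

In this sketch the window simply smooths the point-wise statistic across neighbouring markers, which is one plausible reading of "genomic context"; the Z curve and slope criteria are not sketched because the abstract gives no detail on them.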
2. Decision forest analysis of large-scale sib-pair identical-by-descent profiles for locating the underlying disease genes for alcoholism in human
Xia LI ; Shaoqi RAO ; Wei ZHANG ; Zheng GUO ; Wei JIANG ; Lei DU
Journal of Peking University(Health Sciences) 2006;38(1):71-73
Objective: To extract the SNPs relevant to alcoholism using sib-pair IBD profiles of pedigrees. Methods: We used the ensemble decision approach, a supervised learning approach based on decision forests, to locate alcoholism-relevant SNPs using genome-wide SNP data. Results: Application to a publicly available large dataset of 100 simulated replicates for three American populations (http://www.gaworkshop.org/) demonstrates that the proposed approach successfully located all of the simulated true loci. Conclusion: The numerical results establish the proposed decision forest analysis as a powerful and practical alternative for large-scale family-based association studies.
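The abstract above does not describe how the decision forest ranks SNPs, so the following is a deliberately simplified sketch rather than the authors' method. It assumes a forest of depth-1 trees (stumps), each fitted on a bootstrap sample of sib pairs and a random subset of markers, and it ranks markers by how often they are chosen as the split. The function names, parameters, and toy data are all hypothetical.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_stump(X, y, features):
    """Return (feature, threshold, impurity) of the best single split, or None."""
    best = None
    n = len(y)
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue  # this threshold does not separate anything
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def forest_feature_counts(X, y, n_trees=200, n_features=2, seed=0):
    """Count how often each marker is selected as the split across the forest."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    counts = [0] * d
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of pairs
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        feats = rng.sample(range(d), n_features)    # random subset of markers
        stump = best_stump(Xb, yb, feats)
        if stump is not None:
            counts[stump[0]] += 1
    return counts


# Toy data: marker 0 tracks the label, marker 1 is constant, marker 2 is noise.
toy_X = [
    [1.0, 0.5, 0.0], [1.0, 0.5, 1.0], [1.0, 0.5, 0.0], [1.0, 0.5, 1.0],
    [0.0, 0.5, 0.0], [0.0, 0.5, 1.0], [0.0, 0.5, 0.0], [0.0, 0.5, 1.0],
]
toy_y = [1, 1, 1, 1, 0, 0, 0, 0]
toy_counts = forest_feature_counts(toy_X, toy_y)
```

Selection frequency is used here only because it is easy to read; a fuller forest would grow deeper trees and could rank markers by impurity reduction or by permutation importance instead.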