- Author: Hui AN 1; Chang-Shuai WEI 2; Oliver WANG 3; Da-Hui WANG 1; Liang-Wen XU 4; Qing LU 5; Cheng-Yin YE 1
- Publication Type:Journal Article
- Keywords: Family-based study; Genetic risk prediction; High-dimensional data
- MeSH: Area Under Curve; Computer Simulation; Conduct Disorder/physiopathology*; Family Health; Female; Genetic Markers; Genetic Predisposition to Disease; Genetic Variation; Genome, Human; Genome-Wide Association Study; Genomics; Humans; Likelihood Functions; Male; Models, Genetic; Odds Ratio; Pedigree; ROC Curve; Reproducibility of Results; Risk Factors
- From: Journal of Zhejiang University. Science. B 2018;19(12):935-947
- Country:China
- Language:English
- Abstract:
OBJECTIVE: As one of the most popular designs in genetic research, the family-based design is well recognized for its advantages, such as robustness against population stratification and admixture. With vast amounts of genetic data collected from family-based studies, there is great interest in studying the role of genetic markers in risk prediction. This study aims to develop a new statistical approach for family-based risk prediction analysis with improved prediction accuracy compared with existing methods based on family history.
METHODS: We propose an ensemble-based likelihood ratio (ELR) approach, Fam-ELR, for family-based genomic risk prediction. Fam-ELR incorporates a clustered receiver operating characteristic (ROC) curve method to account for correlations among family samples, and uses a computationally efficient tree-assembling procedure for variable selection and model building.
RESULTS: In simulations, Fam-ELR is robust across various underlying disease models and pedigree structures, and attains better performance than two existing family-based risk prediction methods. In a real-data application to a family-based genome-wide dataset of conduct disorder, Fam-ELR demonstrates its ability to integrate potential risk predictors and interactions into the model for improved accuracy, especially at the genome-wide level.
CONCLUSIONS: Compared with existing approaches, such as the genetic risk-score approach, Fam-ELR has the capacity to incorporate genetic variants with small or moderate marginal effects, and their interactions, into an improved risk prediction model. It is therefore a robust and useful approach for high-dimensional family-based risk prediction, especially for complex diseases with unknown or poorly understood etiology.