An ensemble-based likelihood ratio approach for family-based genomic risk prediction.
Hui AN ; Chang-Shuai WEI ; Oliver WANG ; Da-Hui WANG ; Liang-Wen XU ; Qing LU ; Cheng-Yin YE
Journal of Zhejiang University. Science. B 2018;19(12):935-947
OBJECTIVE:
As one of the most popular designs in genetic research, the family-based design is well recognized for its advantages, such as robustness against population stratification and admixture. With vast amounts of genetic data collected from family-based studies, there is great interest in studying the role of genetic markers in risk prediction. This study aims to develop a new statistical approach for family-based risk prediction analysis with improved prediction accuracy compared with existing methods based on family history.
METHODS:
In this study, we propose an ensemble-based likelihood ratio (ELR) approach, Fam-ELR, for family-based genomic risk prediction. Fam-ELR incorporates a clustered receiver operating characteristic (ROC) curve method to consider correlations among family samples, and uses a computationally efficient tree-assembling procedure for variable selection and model building.
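To illustrate why correlation among family members matters when evaluating an ROC curve, the following minimal sketch computes an empirical area under the curve (AUC) and estimates its standard error by resampling whole families rather than individuals. This is not the clustered ROC method used by Fam-ELR; a simpler family-level bootstrap is substituted here for illustration only. The function names, parameters, and toy data are hypothetical and do not come from the study.

```python
import random
from collections import defaultdict


def empirical_auc(scores, labels):
    """Mann-Whitney estimate of the AUC: P(case score > control score), ties count 1/2."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def family_bootstrap_se(scores, labels, family_ids, n_boot=200, seed=0):
    """Standard error of the AUC from resampling whole families with replacement.

    Assumption: a family-level bootstrap stands in for a clustered ROC
    variance estimator; it keeps relatives together in every resample.
    """
    rng = random.Random(seed)
    members = defaultdict(list)
    for i, fam in enumerate(family_ids):
        members[fam].append(i)
    families = list(members)
    estimates = []
    for _ in range(n_boot):
        drawn = rng.choices(families, k=len(families))
        idx = [i for fam in drawn for i in members[fam]]
        try:
            estimates.append(
                empirical_auc([scores[i] for i in idx], [labels[i] for i in idx])
            )
        except ValueError:
            continue  # resample contained no cases or no controls; skip it
    if len(estimates) < 2:
        raise ValueError("too few valid bootstrap resamples")
    mean = sum(estimates) / len(estimates)
    var = sum((a - mean) ** 2 for a in estimates) / (len(estimates) - 1)
    return var ** 0.5


# Toy data (hypothetical): three families, each with one case and one control.
toy_scores = [0.9, 0.4, 0.7, 0.2, 0.6, 0.3]
toy_labels = [1, 0, 1, 0, 0, 1]
toy_families = ["f1", "f1", "f2", "f2", "f3", "f3"]

toy_auc = empirical_auc(toy_scores, toy_labels)
toy_se = family_bootstrap_se(toy_scores, toy_labels, toy_families)
```

Resampling at the family level keeps relatives together, so the variability of the AUC reflects the number of independent families rather than the number of individuals.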
RESULTS:
Through simulations, Fam-ELR shows its robustness in various underlying disease models and pedigree structures, and attains better performance than two existing family-based risk prediction methods. In a real-data application to a family-based genome-wide dataset of conduct disorder, Fam-ELR demonstrates its ability to integrate potential risk predictors and interactions into the model for improved accuracy, especially on a genome-wide level.
CONCLUSIONS:
Compared with existing approaches, such as the genetic risk-score approach, Fam-ELR can incorporate genetic variants with small or moderate marginal effects, together with their interactions, into an improved risk prediction model. It is therefore a robust and useful approach for high-dimensional family-based risk prediction, especially for complex diseases whose etiology is unknown or poorly understood.
Area Under Curve; Computer Simulation; Conduct Disorder/physiopathology*; Family Health; Female; Genetic Markers; Genetic Predisposition to Disease; Genetic Variation; Genome, Human; Genome-Wide Association Study; Genomics; Humans; Likelihood Functions; Male; Models, Genetic; Odds Ratio; Pedigree; ROC Curve; Reproducibility of Results; Risk Factors