Application of Feature Selection-based Ensemble Learning to Predict Mild Cognitive Impairment in Chinese Elderly
10.11783/j.issn.1002-3674.2025.05.012
- VernacularTitle:基于特征选择的集成学习方法在预测中国老年人轻度认知障碍中的应用
- Author:Yaning SUN¹; Hengchuan ZHANG; Yinyin CHEN
Author Information
1. School of Public Health, Anhui Medical University (230032)
- Publication Type:Journal Article
- Keywords:Mild cognitive impairment; Machine learning; Feature selection; Stacking ensemble models
- From:Chinese Journal of Health Statistics 2025;42(5):705-712
- Country:China
- Language:Chinese
- Abstract:
Objective To construct and validate a risk prediction model for mild cognitive impairment (MCI) in the Chinese elderly population based on ensemble learning methods, so that timely intervention can delay the progression of MCI.
Methods A total of 8691 elderly participants in the Chinese Longitudinal Healthy Longevity Survey (CLHLS) from 2008 to 2018 were included. Data from 2008 to 2014 were used as the training set, and data from 2014 to 2018 were used as the validation set. The Chinese version of the Mini-Mental State Examination (CMMSE) was used to assess the cognitive status of the participants. Recursive feature elimination with random forest (RFE-RF), Boruta, mutual information (MI), and extra trees classifier (ETC) were used to identify predictors, and the common predictors were selected. Logistic regression (LR), random forest (RF), linear discriminant analysis (LDA), K-nearest neighbors (KNN), and naïve Bayes (NB) served as the five single base models, and a stacking ensemble model integrating these base models was built to predict the risk of MCI in elderly Chinese. Accuracy, precision, recall, F1-score, the area under the receiver operating characteristic curve (AUROC), and the area under the precision-recall curve (AUPRC) were used to evaluate model performance.
Results Under every feature selection algorithm, the stacking ensemble model outperformed each single base model, with AUROC greater than 0.9 in all cases. The combination of ETC feature selection and the stacking ensemble model performed best, with an AUROC of 0.912 and an AUPRC of 0.872 in the test set.
Conclusion The stacking model shows superior performance in predicting MCI. By identifying groups at high risk of MCI in a timely manner, it can support China's healthy aging strategy and help reduce the heavy burden that MCI places on the elderly.
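
The pipeline described in the Methods (feature selection followed by a stacking ensemble of LR, RF, LDA, KNN, and NB) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code, and several parts are assumptions made only for illustration: the data are synthetic (not CLHLS), the split is random rather than by survey wave, only the ETC selector is shown (via `SelectFromModel` with a median importance threshold), the meta-learner is assumed to be logistic regression because the abstract does not name one, and all hyperparameters are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 20 candidate predictors, ~30% positives.
X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                           weights=[0.7, 0.3], random_state=0)
# Random stratified split (assumption); the study split by survey period instead.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Feature selection: keep predictors whose extra-trees importance is at or
# above the median importance (threshold chosen here for illustration only).
selector = SelectFromModel(
    ExtraTreesClassifier(n_estimators=100, random_state=0), threshold="median")

# The five base models named in the abstract; LR and KNN get feature scaling.
base_models = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("lda", LinearDiscriminantAnalysis()),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("nb", GaussianNB()),
]

# Stacking: out-of-fold base-model probabilities feed the meta-learner
# (logistic regression here is an assumption, not taken from the paper).
stack = StackingClassifier(estimators=base_models,
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=5, stack_method="predict_proba")

model = make_pipeline(selector, stack)
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]
auroc = roc_auc_score(y_test, proba)
# Average precision is a common step-wise summary of the precision-recall curve.
auprc = average_precision_score(y_test, proba)
n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())

print(f"selected features: {n_selected}")
print(f"AUROC: {auroc:.3f}  AUPRC: {auprc:.3f}")
```

Placing the selector inside the pipeline means it is fitted on the training data only, which avoids leaking information from the held-out set into the choice of predictors. The numbers this sketch prints come from synthetic data and are not comparable to the results reported in the abstract.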