ADASYN and Category Inverse Proportion Weighting Method to Imbalanced Data of Alzheimer's Disease
10.11783/j.issn.1002-3674.2024.02.003
- VernacularTitle:ADASYN与类别逆比例加权法在阿尔茨海默病不平衡数据中的应用
- Author:Hui YANG 1; Fuliang YI; Durong CHEN
Author Information
1. Department of Health Statistics, School of Public Health, Shanxi Medical University (030000)
- Keywords:Category imbalance; Adaptive synthetic sampling; Weighting method; Alzheimer's disease; Classification
- From:Chinese Journal of Health Statistics 2024;41(2):175-180
- Country:China
- Language:Chinese
- Abstract:
Objective To balance the datasets using the adaptive synthetic sampling (ADASYN) algorithm and the category inverse proportion weighting method, and then to perform multi-class prediction of cognitively normal (CN), mild cognitive impairment (MCI), and Alzheimer's disease (AD) in combination with classifiers. Methods Data were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Missing values were imputed with random forest (RF), and feature subsets were selected with elastic net (EN). The ADASYN algorithm and the category inverse proportion weighting method were used to handle the category-imbalanced data, and four models were constructed by combining them with RF and support vector machine (SVM), respectively: ADASYN-RF, ADASYN-SVM, weighted random forest (WRF), and weighted support vector machine (WSVM). Classification performance was evaluated by macro-precision (macro-P), macro-recall (macro-R), macro-F1, accuracy (ACC), the Kappa value, and the area under the receiver operating characteristic curve (AUC). Results ADASYN-RF had the best classification performance (Kappa = 0.938, AUC = 0.980), followed by ADASYN-SVM. The most important classification features obtained using ADASYN-RF were CDRSB, LDELTOTAL, and MMSE, all of which have been clinically validated. Conclusions Both the ADASYN algorithm and the category inverse proportion weighting method can help improve classifier performance, and the ADASYN algorithm performs better.
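
As an illustration of the category inverse proportion weighting idea, here is a minimal sketch in Python. The abstract does not give the authors' exact formula, so this assumes the common normalization w_k = N / (K * n_k), where N is the total number of samples, K is the number of categories, and n_k is the size of category k. The function name `inverse_proportion_weights` and the label counts are hypothetical and are not taken from the ADNI data.

```python
from collections import Counter


def inverse_proportion_weights(labels):
    """Return {category: weight} with weight = N / (K * n_k).

    Assumed formula: each category's weight is inversely proportional
    to its share of the samples, so rarer categories get larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Made-up category sizes for illustration only (not the ADNI distribution).
labels = ["CN"] * 50 + ["MCI"] * 100 + ["AD"] * 25
weights = inverse_proportion_weights(labels)

for cls, w in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{cls}: {w:.3f}")
```

A dictionary of this shape could plausibly be passed as the `class_weight` argument of scikit-learn's `RandomForestClassifier` or `SVC` to obtain weighted models in the spirit of WRF and WSVM. For the oversampling route, the imbalanced-learn package provides an `ADASYN` class in `imblearn.over_sampling`. Whether the authors used these libraries is not stated in the abstract.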