1.Heart Alert: A heart disease prediction system using machine learning approach and optimization techniques
Justin Allen P. Denopol ; Ma. Sheila A. Magboo ; Vincent Peter C. Magboo
Philippine Journal of Health Research and Development 2022;26(3):83-92
Background:
Cardiovascular diseases are among the top three leading causes of mortality in the Philippines, accounting for 17.8% of total deaths. Lifestyle-related factors such as alcohol consumption, smoking, poor diet and nutrition, high sedentary behavior, overweight, and obesity have been increasingly implicated in the high rates of heart disease among Filipinos, placing a significant burden on the country's healthcare system. The objective of this study was to predict the presence of heart disease using various machine learning algorithms (support vector machine, naïve Bayes, random forest, logistic regression, decision tree, and adaptive boosting) evaluated on an anonymized, publicly available cardiovascular disease dataset.
Methodology:
Various machine learning algorithms were applied to an anonymized, publicly available cardiovascular dataset from a machine learning data repository (IEEE Dataport). A web-based application named Heart Alert, which predicts the risk of developing heart disease, was developed from the best-performing machine learning model. The effects of different optimization techniques on the classification performance of the algorithms were assessed: imputation methods (mean, median, mode, and multiple imputation by chained equations) and a feature selection method (recursive feature elimination). All simulation experiments were implemented in Python 3.8 with its machine learning libraries (Scikit-learn, Keras, TensorFlow, Pandas, Matplotlib, Seaborn, NumPy).
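As a rough illustration of the three single-value imputation strategies named in the Methodology (mean, median, mode), the sketch below fills missing entries in one numeric column using only Python's standard library. The function name `impute`, the column values, and the use of `None` to mark missing data are assumptions made for illustration; the study itself used Scikit-learn, and multiple imputation by chained equations is not shown here.

```python
from statistics import mean, median, mode

def impute(column, strategy="mean"):
    """Replace None entries in a numeric column with a statistic of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    # Map each strategy name to the statistic used as the fill value.
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in column]

# Hypothetical column with two missing entries (not data from the study).
cholesterol = [200, None, 240, 200, None, 260]
for strategy in ("mean", "median", "mode"):
    print(strategy, impute(cholesterol, strategy))
```

Each strategy yields a different fill value for the same column, which is why the choice of imputation method can change downstream classifier performance.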
Results:
The support vector machine without imputation and feature selection obtained the highest performance metrics (90.2% accuracy, 87.7% sensitivity, 93.6% specificity, 94.9% precision, 91.2% F1-score, and an area under the receiver operating characteristic curve of 0.902) and was used to implement the heart disease prediction system (Heart Alert). Following very closely were random forest with mean or median imputation and logistic regression with mode imputation, all without feature selection.
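The reported metrics all derive from the four cells of a binary confusion matrix. The sketch below assumes the usual definitions and uses hypothetical function names. The confusion-matrix counts are illustrative only and are not taken from the study; the final line checks that the reported F1-score is consistent with the reported precision and sensitivity.

```python
def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity."""
    return 2 * precision * sensitivity / (precision + sensitivity)

def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (denominators assumed non-zero)."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1_score(precision, sensitivity),
    }

# Illustrative counts only (not from the study).
print(binary_metrics(tp=40, fp=10, tn=40, fn=10))

# Consistency check with the precision and sensitivity reported in the abstract.
print(round(f1_score(0.949, 0.877), 3))  # 0.912
```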
Conclusion:
The performance of the best four machine learning models suggests that, for this dataset, imputation of missing values is not essential. Likewise, recursive feature elimination may not be needed, as all variables seem to be important in heart disease prediction. An early, accurate diagnosis leading to prompt intervention is crucial, as it improves the patient's quality of life and diminishes the risk of developing cardiac events.
Machine Learning ; Support Vector Machine
2.Population Pharmacokinetic and Pharmacodynamic Models of Propofol in Healthy Volunteers using NONMEM and Machine Learning Methods.
Yoo Mi KIM ; Sung Hong KANG ; Il Su PARK ; Gyu Jeong NOH
Journal of Korean Society of Medical Informatics 2008;14(2):147-159
OBJECTIVES: The primary objective of this study was to compare the performance of machine learning methods with that of a previous study in which a nonlinear mixed effects model was created using NONMEM® for the pharmacokinetic and pharmacodynamic data for propofol. The secondary objective was to evaluate whether a pharmacodynamic model describing the relationship between the dose of propofol and bispectral index (BIS) outperforms one describing the relationship between propofol concentrations predicted by a pharmacokinetic model and BIS. METHODS: Data were collected during a study involving the infusion of propofol into healthy volunteers. Pharmacokinetic and pharmacodynamic models were constructed using artificial neural networks (ANNs), support vector machines (SVMs), and multi-method ensembles and were compared with the nonlinear mixed effects method as implemented by NONMEM®. Model performance was assessed by goodness-of-fit statistics, paired t-tests between predicted and observed values for each model, and scatterplots. RESULTS: In pharmacokinetic analysis, ensemble I, the mean of ANN and NONMEM® predictions, achieved the lowest error and the highest correlation coefficient. SVM produced the highest error and the lowest correlation coefficient. In pharmacodynamic analysis, ANN exhibited the best performance. An ANN model describing the relationship between the dose of propofol and BIS was not inferior to an ANN model describing the relationship between BIS and propofol concentrations predicted by an ANN pharmacokinetic model. CONCLUSIONS: In pharmacokinetic analysis, an ensemble combined with ANN achieved slightly better performance than NONMEM®. The relationship between the dose of propofol and BIS can be predicted without considering the pharmacokinetics of propofol.
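Ensemble I is described as the mean of the ANN and NONMEM predictions. A minimal sketch of that averaging step, with root-mean-square error as one possible goodness-of-fit statistic, might look as follows. The concentration values and function names are hypothetical, and the abstract does not state which goodness-of-fit statistics were actually used.

```python
import math

def ensemble_mean(pred_a, pred_b):
    """Element-wise mean of two models' predictions."""
    return [(a + b) / 2 for a, b in zip(pred_a, pred_b)]

def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Hypothetical values chosen so the two models err in opposite directions.
observed = [1.0, 2.0, 3.0, 4.0]
ann_pred = [1.5, 2.5, 3.5, 4.5]      # over-predicts by 0.5
nonmem_pred = [0.5, 1.5, 2.5, 3.5]   # under-predicts by 0.5

ensemble = ensemble_mean(ann_pred, nonmem_pred)
print(rmse(ann_pred, observed), rmse(nonmem_pred, observed), rmse(ensemble, observed))
```

When the component models' errors partly cancel, as in this contrived case, the averaged prediction has lower error than either model alone; real data will rarely cancel this cleanly.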
Machine Learning ; Propofol ; Support Vector Machine
3.MicroRNA Target Prediction Based on Support Vector Machine Ensemble Classification Algorithm of Under-sampling Technique.
Journal of Biomedical Engineering 2016;33(1):72-77
To address the low prediction accuracy for positive samples and the poor overall classification caused by unbalanced microRNA (miRNA) target sample data, we propose a support vector machine (SVM)-integration of under-sampling and weight (IUSM) algorithm, an under-sampling method based on ensemble learning. The algorithm adopts SVM as the learning algorithm and AdaBoost as the integration framework, and embeds clustering-based under-sampling into the iterative process, aiming to reduce the imbalance between positive and negative samples. Meanwhile, during the adaptive adjustment of sample weights, the SVM-IUSM algorithm eliminates abnormal negative samples using a robust sample weight smoothing mechanism so as to avoid over-learning. Finally, the integrated miRNA target classifier is obtained by combining multiple weak classifiers through a voting mechanism. Experiments showed that, compared with other algorithms on a collection of unbalanced datasets, SVM-IUSM not only improved the accuracy for positive targets and the overall classification performance, but also enhanced the generalization ability of the miRNA target classifier.
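The general idea of combining under-sampling with a voting ensemble can be sketched in a few lines. This is a heavily simplified stand-in rather than the SVM-IUSM algorithm: random under-sampling replaces the clustering-based under-sampling, independent rounds replace the AdaBoost weight updates, and a hypothetical one-dimensional nearest-mean classifier replaces the SVM. All names and data are assumptions for illustration.

```python
import random

def undersample(pos, neg, rng):
    """Keep all positives and draw an equally sized random subset of negatives."""
    return pos, rng.sample(neg, k=min(len(pos), len(neg)))

def fit_nearest_mean(pos, neg):
    """Hypothetical weak learner: label a 1-D point by the nearer class mean."""
    mu_pos = sum(pos) / len(pos)
    mu_neg = sum(neg) / len(neg)
    return lambda x: 1 if abs(x - mu_pos) <= abs(x - mu_neg) else 0

def fit_ensemble(pos, neg, rounds=5, seed=0):
    """Train one weak learner per round, each on a freshly balanced subset."""
    rng = random.Random(seed)
    return [fit_nearest_mean(*undersample(pos, neg, rng)) for _ in range(rounds)]

def predict(ensemble, x):
    """Majority vote over the weak learners."""
    votes = sum(clf(x) for clf in ensemble)
    return 1 if 2 * votes > len(ensemble) else 0

# Hypothetical unbalanced data: 3 positives versus 30 negatives.
positives = [4.0, 5.0, 6.0]
negatives = [i / 15 for i in range(30)]   # values in [0, 2)
model = fit_ensemble(positives, negatives)
print(predict(model, 5.5), predict(model, 0.5))
```

Because every round sees a balanced subset, no single learner is dominated by the majority class, and the vote smooths out the randomness of each subset.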
Algorithms ; MicroRNAs ; chemistry ; Support Vector Machine
4.MicroRNA target prediction based on SVM and the optimized feature set.
Baowen WANG ; Xiaoyang QI ; Changwu WANG ; Wenyuan LIU ; Yali SI
Journal of Biomedical Engineering 2013;30(6):1213-1218
MicroRNAs (miRNAs) are a family of endogenous single-stranded RNAs about 22 nucleotides in length. By targeting the 3' UTR of messenger RNA (mRNA), they play important roles in post-transcriptional regulation. Further research on miRNA function urgently requires the identification of more positive miRNA targets. To handle the high-dimensional, small-sample data sets in miRNA target prediction, this paper proposes a redundant-feature elimination algorithm based on ν-SVM that fuses classification and feature selection. The algorithm optimizes the combination of features and then constructs the best feature combination to represent the miRNA-target interaction model. The prior parameter ν (0 < ν ≤ 1) controls the compression proportion of the data set and selects more discriminative support vectors. Finally, the classifier model for miRNA target prediction is built. An unbiased assessment of the classifier is achieved with a completely independent test dataset. Experimental results indicated that, in both classification and generalization performance for miRNA target prediction, this model was superior to existing machine learning algorithms such as miTarget, NBmiRTar, and TargetMiner.
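For orientation, the standard ν-SVM classification problem (written here with ν for the prior parameter) can be stated as below. The abstract does not give the formulation the authors used, so this is the textbook form under assumed notation: n training pairs (x_i, y_i) with labels y_i in {-1, +1}, feature map φ, slack variables ξ_i, and margin variable ρ.

```latex
\min_{w,\,b,\,\xi,\,\rho}\quad
\frac{1}{2}\lVert w\rVert^{2} \;-\; \nu\rho \;+\; \frac{1}{n}\sum_{i=1}^{n}\xi_i
\qquad\text{subject to}\qquad
y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge \rho - \xi_i,\quad
\xi_i \ge 0,\quad \rho \ge 0 .
```

In this formulation ν is an upper bound on the fraction of margin errors and a lower bound on the fraction of training points that become support vectors, which is one way to read the statement that it controls how strongly the data set is compressed.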
MicroRNAs ; Models, Theoretical ; Support Vector Machine
5.Classification Model of Corneal Opacity Based on Digital Image Features.
Peng LUO ; Jilong ZHENG ; Peng ZHOU ; Yongde ZHANG ; Shijie CHANG ; Xianzheng SHA
Chinese Journal of Medical Instrumentation 2021;45(4):361-365
OBJECTIVE:
Based on the digital image features of corneal opacity, a support vector machine (SVM) multi-classification model was established to explore an objective method for quantifying corneal opacity.
METHODS:
Digital corneal images of dead pigs were collected, a subset of color and texture features was extracted based on previous experience, and the SVM multi-classification model was established. The test results of the model were evaluated by precision, sensitivity and
RESULTS:
In the classification of corneal opacity, the highest
CONCLUSIONS:
The SVM multi-classification model can classify the degree of corneal opacity.
Animals ; Corneal Opacity ; Support Vector Machine ; Swine
6.Hierarchical Classification of ECG Beat Using Higher Order Statistics and Hermite Model.
Kwan Soo PARK ; Baek Hwan CHO ; Do Hoon LEE ; Su Hwa SONG ; Jong Shill LEE ; Young Joon CHEE ; In Young KIM ; Sun I KIM
Journal of Korean Society of Medical Informatics 2009;15(1):117-131
OBJECTIVE: The heartbeat classification of the electrocardiogram is important in cardiac disease diagnosis. For detecting the QRS complex, conventional detection algorithms have been designed to detect the P, QRS, and T waves first. However, the detection of the P and T waves is difficult because their amplitudes are relatively low, and occasionally they are buried in noise. Furthermore, the conventional multiclass classification method may produce results skewed toward the majority class because of the unbalanced data distribution. METHODS: The Hermite model of the higher order statistics is a good characterization method for recognizing the morphology of the QRS complex. We applied three morphological feature extraction methods for detecting the QRS complex: higher order statistics, Hermite basis functions, and the Hermite model of the higher order statistics. A hierarchical scheme tackles the unbalanced data distribution problem. We also employed a hierarchical classification method using support vector machines. RESULTS: We compared classification methods with feature extraction methods. The mean sensitivity values for the hierarchical classification method (75.47%, 76.16% and 81.21%) were better than that of the conventional multiclass classification method (46.16%). In addition, the Hermite model of the higher order statistics gave the best results compared to the higher order statistics and the Hermite basis functions in the hierarchical classification method. CONCLUSION: This research suggests that the Hermite model of the higher order statistics is feasible for heartbeat feature extraction. The hierarchical classification is also feasible for heartbeat classification tasks that have an unbalanced data distribution.
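To give a sense of what higher order statistics of a beat can look like as features, the sketch below computes the normalized third and fourth central moments (skewness and excess kurtosis) of a sample window. Which statistics the authors actually used is not stated in the abstract, so this choice, the function name, and the sample window are all assumptions.

```python
def hos_features(window):
    """Return (skewness, excess kurtosis) of a list of samples.

    Uses population (1/n) central moments; the window must not be constant.
    """
    n = len(window)
    mu = sum(window) / n
    m2 = sum((v - mu) ** 2 for v in window) / n   # variance
    m3 = sum((v - mu) ** 3 for v in window) / n   # third central moment
    m4 = sum((v - mu) ** 4 for v in window) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis

# Hypothetical beat window: a sharp positive deflection on a flat baseline.
beat = [0.0, 0.0, 0.1, 0.9, 0.1, 0.0, 0.0, 0.0]
print(hos_features(beat))
```

A narrow, tall deflection such as a QRS-like spike produces positive skewness and high kurtosis, whereas a symmetric, flat-topped window does not, which is why such moments can help characterize beat morphology.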
Classification* ; Diagnosis ; Electrocardiography* ; Heart Diseases ; Noise ; Support Vector Machine
7.Detection of Neural Fates from Random Differentiation: Application of Support Vector Machine.
Min Su LEE ; Jeong Hyuck AHN ; Woong Yang PARK
Genomics & Informatics 2007;5(1):1-5
Embryonic stem cells can be differentiated into various types of cells, which requires tight regulation of transcription. Biomarkers related to each cell lineage are used to guide differentiation into neural or other fates. In previous experiments, we reported guided differentiation (GD)-specific genes by comparing them with profiles of random differentiation (RD). Interestingly, 68% of the differentially expressed genes in GD overlap with those of RD, which makes it difficult to separate the lineages by examining several markers. In this paper, we design a prediction model to distinguish differentiation into neural fates from any other lineage. From the profiles of 11,376 genes, 203 genes differentially expressed between neural and random differentiation were selected by random variance t-test with 95% confidence and a 5% false discovery rate. Using the support vector machine algorithm, we selected 79 marker genes from the 203 informative genes to construct the optimal prediction model. The resulting model predicts neural fates from random differentiation with perfect accuracy.
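Selecting genes at a 5% false discovery rate is commonly done with a step-up procedure over per-gene p-values. The sketch below uses the Benjamini-Hochberg procedure as an assumed example; the abstract does not name the FDR method, and the p-values and function name are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Return the indices of hypotheses rejected at the given false discovery rate."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # indices by ascending p-value
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value is under its threshold.
        if pvalues[i] <= rank / m * fdr:
            cutoff = rank
    return sorted(order[:cutoff])

# Hypothetical per-gene p-values (not from the study).
pvals = [0.001, 0.03, 0.035, 0.5]
print(benjamini_hochberg(pvals))  # [0, 1, 2]
```

Note that the gene with p = 0.03 fails its own threshold (0.025) but is still selected, because a larger p-value further down the ranking passes; that is the defining behavior of a step-up procedure.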
Embryonic Stem Cells ; Stem Cells ; Support Vector Machine ; Biomarkers
8.Tumor segmentation on multi-modality magnetic resonance images based on SVM model parameter optimization.
Xiaochun WANG ; Jing HUANG ; Feng YANG ; Man LUO
Journal of Southern Medical University 2014;34(5):641-645
OBJECTIVE: To develop a method for tumor segmentation on multi-modality magnetic resonance (MR) images based on parameter optimization of an SVM model.
METHODS: Each of the 4 sub-classifiers was trained using the feature information in mono-modality MR images and applied to the corresponding modality images. The classification results differed because the selected support vectors of the mono-modality images carried different information. By modifying the weight values of the error data points, we chose the best weight value for each sub-classifier to obtain a weighted combination SVM classifier of multi-modalities for use in MR image segmentation.
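The combination step can be pictured as a weighted sum of the four sub-classifiers' decision scores. The sketch below is only an illustration under stated assumptions: the weighting rule (normalized validation accuracy) is hypothetical and is not the error-point-based weight selection described in the abstract, and all scores, accuracies, and names are made up.

```python
def accuracy_weights(accuracies):
    """Hypothetical weighting rule: normalize validation accuracies to sum to 1."""
    total = sum(accuracies)
    return [a / total for a in accuracies]

def combine_scores(scores, weights):
    """Weighted sum of signed decision scores; a positive total is labelled tumor (1)."""
    if len(scores) != len(weights):
        raise ValueError("one weight per sub-classifier is required")
    total = sum(w * s for w, s in zip(weights, scores))
    return 1 if total > 0 else 0

# Hypothetical signed scores from four mono-modality sub-classifiers for one voxel.
scores = [0.9, -0.2, 0.4, -0.1]
weights = accuracy_weights([0.85, 0.70, 0.80, 0.72])
print(combine_scores(scores, weights))
```

Here two modalities vote weakly against "tumor" and two vote more strongly for it, so the weighted total is positive; with different weights the same scores could flip the label, which is why the choice of weights matters.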
RESULTS: This tumor image segmentation method was validated on the MR images of brain tumors in 34 patients and resulted in an average classification accuracy of 90.59%. Compared with the 4 mono-modality classifiers, multi-modality RBF kernel SVM classifiers increased the overall accuracy by 5.76%-20.11%.
CONCLUSION: The proposed method combines multi-modality images with SVM classifiers to allow accurate tumor image segmentation from MR images with a high precision.
Brain Neoplasms ; diagnosis ; Humans ; Magnetic Resonance Spectroscopy ; Support Vector Machine
9.Automatic Sleep Staging Method Based on Energy Features and Least Squares Support Vector Machine Classifier.
Qunxia GAO ; Jing ZHOU ; Binggang YE ; Xiaoming WU
Journal of Biomedical Engineering 2015;32(3):531-536
Sleep staging research is not only the basis for diagnosing sleep-related diseases but also a precondition for evaluating sleep quality, and it has important clinical significance. In recent years, computer-based automatic sleep staging has become a research hotspot and has made some progress. Feature extraction and feature classification are the two key technologies in an automatic sleep staging system. To achieve effective automatic sleep staging, we proposed a new method that combines energy features with a least squares support vector machine (LS-SVM). First, we used an FIR band-pass filter to extract the energy features of Pz-Oz channel sleep electroencephalogram (EEG) signals and compared them with those from the wavelet packet transform (WPT) method. Then we designed an LS-SVM classifier to perform automatic sleep stage classification. The results showed that, for data from the Sleep-EDF Database, the FIR band-pass filter (with the Kaiser window) performed better than WPT for energy feature extraction and the LS-SVM classifier (with the RBF kernel function) performed well. With an average accuracy of 88.89%, the proposed method was better than many similar methods from other studies and has promising application prospects.
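A Kaiser-windowed FIR band-pass filter followed by a sum of squares is one way to obtain a band energy feature. The sketch below builds such a filter from scratch with the standard library so that each step is visible. The band edges, sampling rate, tap count, and Kaiser beta are assumed values, not the authors' settings, and in practice a signal-processing library would normally be used for the filter design.

```python
import math

def bessel_i0(x, terms=25):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    return sum(((x / 2) ** k / math.factorial(k)) ** 2 for k in range(terms))

def kaiser_window(n_taps, beta):
    """Kaiser window of length n_taps with shape parameter beta."""
    m = n_taps - 1
    return [bessel_i0(beta * math.sqrt(max(0.0, 1 - (2 * k / m - 1) ** 2))) / bessel_i0(beta)
            for k in range(n_taps)]

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def bandpass_taps(low_hz, high_hz, fs, n_taps=101, beta=5.0):
    """Windowed-sinc band-pass: difference of two ideal low-pass responses times a Kaiser window."""
    mid = (n_taps - 1) / 2
    fl, fh = low_hz / fs, high_hz / fs   # cutoffs in cycles per sample
    window = kaiser_window(n_taps, beta)
    return [(2 * fh * sinc(2 * fh * (k - mid)) - 2 * fl * sinc(2 * fl * (k - mid))) * w
            for k, w in enumerate(window)]

def band_energy(signal, taps):
    """Sum of squared samples of the filtered signal (fully overlapping part only)."""
    n = len(taps)
    filtered = [sum(taps[j] * signal[i + n - 1 - j] for j in range(n))
                for i in range(len(signal) - n + 1)]
    return sum(v * v for v in filtered)

# Synthetic check: a 10 Hz tone should carry far more energy in an 8-12 Hz band than a 30 Hz tone.
fs = 100  # assumed sampling rate in Hz
tone_10 = [math.sin(2 * math.pi * 10 * t / fs) for t in range(600)]
tone_30 = [math.sin(2 * math.pi * 30 * t / fs) for t in range(600)]
taps = bandpass_taps(8, 12, fs)
print(band_energy(tone_10, taps) > band_energy(tone_30, taps))
```

Repeating `band_energy` with one set of taps per EEG rhythm band would give one energy feature per band for each epoch, which could then feed a classifier.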
Electroencephalography ; Humans ; Least-Squares Analysis ; Sleep Stages ; Support Vector Machine
10.Study on Sleep Staging Based on Support Vector Machines and Feature Selection in Single Channel Electroencephalogram.
Xiujing LIN ; Yongming XIA ; Songrong QIAN
Journal of Biomedical Engineering 2015;32(3):503-513
Sleep electroencephalogram (EEG) is an important index in diagnosing sleep disorders and related diseases. Manual sleep staging is time-consuming and often influenced by subjective factors. Existing automatic sleep staging methods have high complexity and a low accuracy rate. This paper proposes a sleep staging method based on support vector machines (SVM) and feature selection using a single channel EEG signal. Thirty-eight features were extracted from the single channel EEG signal. The F-Score feature selection method was then extended from its original definition to the multiclass case, with an added eliminate factor, in order to find suitable features, which were used as the SVM classifier inputs. The eliminate factor was adopted to reduce the negative interaction among features on the result. The F-Score with the added eliminate factor was further evaluated on data from a standard open source database, and the results were compared with those obtained with no feature selection and with standard F-Score feature selection. The results showed that the proposed method could effectively improve the sleep staging accuracy and reduce the computation time.
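A multiclass F-Score can be sketched as the ratio of between-class spread to within-class variance for a single feature. The version below is one common generalization and is an assumption on our part; it does not include the eliminate factor, whose definition the abstract does not give. Feature values and labels are hypothetical.

```python
from statistics import mean, variance

def f_score(values, labels):
    """Multiclass F-Score of one feature: between-class spread over within-class variance.

    Each class needs at least two samples, and at least one class must vary.
    """
    overall = mean(values)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    between = sum((mean(g) - overall) ** 2 for g in groups.values())
    within = sum(variance(g) for g in groups.values())   # sample variances (n - 1)
    return between / within

# Hypothetical features for six epochs in two sleep stages.
stages = [0, 0, 0, 1, 1, 1]
discriminative = [1, 2, 3, 11, 12, 13]   # class means far apart
uninformative = [1, 2, 3, 1, 2, 3]       # identical distribution in both classes
print(f_score(discriminative, stages), f_score(uninformative, stages))
```

Ranking the thirty-eight features by such a score and keeping the top-ranked ones would be one straightforward way to choose the SVM inputs.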
Databases, Factual ; Electroencephalography ; Humans ; Sleep Stages ; Support Vector Machine