1.Heart Alert: A heart disease prediction system using machine learning approach and optimization techniques
Justin Allen P. Denopol ; Ma. Sheila A. Magboo ; Vincent Peter C. Magboo
Philippine Journal of Health Research and Development 2022;26(3):83-92
Background:
Cardiovascular diseases are among the top three leading causes of mortality in the Philippines, accounting for 17.8% of total deaths. Lifestyle-related factors such as alcohol consumption, smoking, poor diet and nutrition, highly sedentary behavior, overweight, and obesity have been increasingly implicated in the high rates of heart disease among Filipinos, placing a significant burden on the country's healthcare system. The objective of this study was to predict the presence of heart disease using various machine learning algorithms (support vector machine, naïve Bayes, random forest, logistic regression, decision tree, and adaptive boosting) evaluated on an anonymized, publicly available cardiovascular disease dataset.
Methodology:
Various machine learning algorithms were applied to an anonymized, publicly available cardiovascular dataset from a machine learning data repository (IEEE Dataport). A web-based application named Heart Alert, which predicts the risk of developing heart disease, was developed from the best-performing machine learning model. The effects of different optimization techniques on the classification performance of the algorithms were assessed: imputation methods (mean, median, mode, and multiple imputation by chained equations) and a feature selection method (recursive feature elimination). All simulation experiments were implemented in Python 3.8 with its machine learning libraries (Scikit-learn, Keras, TensorFlow, Pandas, Matplotlib, Seaborn, NumPy).
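The three single-value imputation strategies named above can be illustrated with a minimal sketch. It uses only the Python standard library rather than the study's own pipeline; the helper name `impute` and the cholesterol readings are hypothetical, and multiple imputation by chained equations is left out because it needs a model-based imputer.

```python
from statistics import mean, median, mode

def impute(values, strategy="mean"):
    """Replace None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]

# Hypothetical cholesterol readings with two missing entries (not study data).
cholesterol = [210, None, 180, 240, 180, None]

for strategy in ("mean", "median", "mode"):
    print(strategy, impute(cholesterol, strategy))
```

The fill value is computed from the observed entries only, so in a real experiment it would be estimated on the training split and then reused on the test split.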
Results:
The support vector machine without imputation and feature selection obtained the highest performance metrics (90.2% accuracy, 87.7% sensitivity, 93.6% specificity, 94.9% precision, 91.2% F1-score, and an area under the receiver operating characteristic curve of 0.902) and was used to implement the heart disease prediction system (Heart Alert). Following very closely were random forest with mean or median imputation and logistic regression with mode imputation, all without feature selection.
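As a quick consistency check on the reported figures, the F1-score is the harmonic mean of precision and sensitivity. The sketch below recomputes it from the two reported values and also shows how the other metrics follow from a confusion matrix; the function names and the confusion-matrix counts are hypothetical illustrations, not values from the study.

```python
def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)

def classification_metrics(tp, fp, tn, fn):
    """Derive the usual binary metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1_score(precision, sensitivity),
    }

# Reported precision (94.9%) and sensitivity (87.7%) reproduce the reported F1.
print(round(f1_score(0.949, 0.877), 3))  # 0.912

# Hypothetical counts, only to show the function in use.
print(classification_metrics(tp=50, fp=5, tn=40, fn=5))
```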
Conclusion:
The performance of the best four machine learning models suggests that, for this dataset, imputation of missing values is optional. Likewise, recursive feature elimination may not be applicable, as all variables seem to be important in heart disease prediction. An early, accurate diagnosis leading to prompt intervention is crucial because it improves the patient's quality of life and diminishes the risk of developing cardiac events.
Machine Learning
;
Support Vector Machine
2.Population Pharmacokinetic and Pharmacodynamic Models of Propofol in Healthy Volunteers using NONMEM and Machine Learning Methods.
Yoo Mi KIM ; Sung Hong KANG ; Il Su PARK ; Gyu Jeong NOH
Journal of Korean Society of Medical Informatics 2008;14(2):147-159
OBJECTIVES: The primary objective of this study was to compare the performance of machine learning methods with that of a previous study in which a nonlinear mixed effects model was created using NONMEM(R) for the pharmacokinetic and pharmacodynamic data for propofol. The secondary objective was to evaluate whether a pharmacodynamic model describing the relationship between the dose of propofol and bispectral index (BIS) outperforms one describing the relationship between propofol concentrations predicted by a pharmacokinetic model and BIS. METHODS: Data were collected during a study involving the infusion of propofol into healthy volunteers. Pharmacokinetic and pharmacodynamic models were constructed using artificial neural networks (ANNs), support vector machines (SVMs), and multi-method ensembles and were compared with the nonlinear mixed effects method as implemented by NONMEM(R). Model performance was assessed by goodness-of-fit statistics, paired t-tests between predicted and observed values for each model, and scatterplots. RESULTS: In pharmacokinetic analysis, ensemble I, the mean of ANN and NONMEM(R) predictions, achieved minimal error and the highest correlation coefficient. SVM produced the highest error and the lowest correlation coefficient. In pharmacodynamic analysis, ANN exhibited the best performance. An ANN model describing the relationship between the dose of propofol and BIS was not inferior to an ANN model describing the relationship between BIS and propofol concentrations predicted by an ANN pharmacokinetic model. CONCLUSIONS: In pharmacokinetic analysis, the ensemble combined with ANN achieved slightly better performance than NONMEM(R). The relationship between the dose of propofol and BIS can be predicted without considering the pharmacokinetics of propofol.
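Ensemble I is described as the mean of the ANN and NONMEM(R) predictions. A minimal sketch of that averaging step, with root-mean-square error and the Pearson correlation as assumed goodness-of-fit statistics, might look like the following. All numbers are hypothetical and were chosen so that the two models' errors partly cancel, which is the situation in which averaging helps.

```python
from math import sqrt

def ensemble_mean(pred_a, pred_b):
    """Average two models' predictions point by point."""
    return [(a + b) / 2 for a, b in zip(pred_a, pred_b)]

def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Hypothetical concentrations (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
ann = [1.2, 2.3, 2.9, 4.4]      # stand-in for ANN predictions
nonmem = [0.8, 1.9, 3.3, 3.8]   # stand-in for NONMEM predictions
ensemble = ensemble_mean(ann, nonmem)

for name, pred in (("ANN", ann), ("NONMEM", nonmem), ("ensemble", ensemble)):
    print(name, round(rmse(pred, observed), 3), round(pearson_r(pred, observed), 3))
```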
Machine Learning
;
Propofol
;
Support Vector Machine
3.MicroRNA target prediction based on SVM and the optimized feature set.
Baowen WANG ; Xiaoyang QI ; Changwu WANG ; Wenyuan LIU ; Yali SI
Journal of Biomedical Engineering 2013;30(6):1213-1218
MicroRNAs (miRNAs) are a family of endogenous single-stranded RNAs about 22 nucleotides in length. By targeting the 3' UTR of messenger RNA (mRNA), they play important roles in post-transcriptional regulation. To further study miRNA function, more positive miRNA targets urgently need to be identified. Aiming at the high-dimensional, small-sample data sets in miRNA target prediction, this paper proposes a ν-SVM-based algorithm for eliminating redundant features, in which classification and feature selection are fused. The algorithm optimizes the combination of features and then constructs the best feature combination to represent the miRNA-target interaction model. The prior parameter ν (0 < ν ≤ 1) controls the compression proportion of the data set and selects more discriminative support vectors. Finally, the classifier model for miRNA target prediction is built. An unbiased assessment of the classifier is achieved with a completely independent test dataset. Experimental results indicated that, in both classification performance and generalization on miRNA target prediction, this model was superior to existing machine learning algorithms such as miTarget, NBmiRTar, and TargetMiner.
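For orientation, the standard ν-SVM classification problem is reproduced below. The abstract does not state the exact formulation the authors used, so this is the textbook form rather than their method. The symbols are the usual ones: $m$ training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, feature map $\phi$, slack variables $\xi_i$, and margin parameter $\rho$.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\rho}\quad & \tfrac{1}{2}\lVert w \rVert^{2} \;-\; \nu\rho \;+\; \frac{1}{m}\sum_{i=1}^{m}\xi_i \\
\text{subject to}\quad & y_i\bigl(\langle w, \phi(x_i)\rangle + b\bigr) \;\ge\; \rho - \xi_i, \\
& \xi_i \ge 0 \quad (i = 1,\dots,m), \qquad \rho \ge 0 .
\end{aligned}
```

In this formulation, ν is an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors, which is why it takes values in (0, 1].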
MicroRNAs
;
Models, Theoretical
;
Support Vector Machine
4.MicroRNA Target Prediction Based on Support Vector Machine Ensemble Classification Algorithm of Under-sampling Technique.
Journal of Biomedical Engineering 2016;33(1):72-77
Considering the low prediction accuracy for positive samples and the poor overall classification caused by unbalanced microRNA (miRNA) target data, we propose in this paper a support vector machine (SVM)-integration of under-sampling and weight (IUSM) algorithm, an ensemble learning algorithm based on under-sampling. The algorithm adopts SVM as the learning algorithm and AdaBoost as the integration framework, and embeds clustering-based under-sampling into the iterative process to reduce the imbalance between positive and negative samples. Meanwhile, during adaptive adjustment of the sample weights, the SVM-IUSM algorithm eliminates abnormal negative samples with a robust sample-weight smoothing mechanism so as to avoid over-learning. Finally, the integrated miRNA target classifier is obtained by combining multiple weak classifiers through a voting mechanism. The experiment revealed that, compared with other algorithms on unbalanced datasets, SVM-IUSM could not only improve the accuracy on positive targets and the overall classification performance, but also enhance the generalization ability of the miRNA target classifier.
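The two ideas of balancing by under-sampling and combining weak classifiers by voting can be sketched in a much simplified form. This is not the authors' algorithm: it assumes random (not clustering-based) under-sampling, uses threshold stumps in place of SVMs, and omits the AdaBoost weight updates and weight smoothing. All names and data are hypothetical.

```python
import random
from collections import Counter

def undersample(positives, negatives, rng):
    """Keep all positives and draw an equally sized random subset of negatives."""
    return positives + rng.sample(negatives, len(positives))

def train_stump(samples):
    """Fit a 1-D threshold classifier halfway between the two class means."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == -1]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > threshold else -1

def majority_vote(classifiers, x):
    """Return the label predicted by most of the weak classifiers."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Hypothetical unbalanced data: 5 positives, 40 negatives, one feature each.
rng = random.Random(0)
positives = [(2.0 + 0.1 * i, 1) for i in range(5)]
negatives = [(0.05 * i, -1) for i in range(40)]

# Train five weak classifiers, each on its own balanced subset.
ensemble = [train_stump(undersample(positives, negatives, rng)) for _ in range(5)]
print(majority_vote(ensemble, 2.2), majority_vote(ensemble, 0.3))
```

Using an odd number of weak classifiers avoids ties in the two-class vote.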
Algorithms
;
MicroRNAs
;
chemistry
;
Support Vector Machine
5.Classification Model of Corneal Opacity Based on Digital Image Features.
Peng LUO ; Jilong ZHENG ; Peng ZHOU ; Yongde ZHANG ; Shijie CHANG ; Xianzheng SHA
Chinese Journal of Medical Instrumentation 2021;45(4):361-365
OBJECTIVE:
According to the digital image features of corneal opacity, a multi classification model of support vector machine (SVM) was established to explore the objective quantification method of corneal opacity.
METHODS:
The cornea digital images of dead pigs were collected, part of the color features and texture features were extracted according to the previous experience, and the SVM multi classification model was established. The test results of the model were evaluated by precision, sensitivity and
RESULTS:
In the classification of corneal opacity, the highest
CONCLUSIONS:
The SVM multi classification model can classify the degree of corneal opacity.
Animals
;
Corneal Opacity
;
Support Vector Machine
;
Swine
6.Hierarchical Classification of ECG Beat Using Higher Order Statistics and Hermite Model.
Kwan Soo PARK ; Baek Hwan CHO ; Do Hoon LEE ; Su Hwa SONG ; Jong Shill LEE ; Young Joon CHEE ; In Young KIM ; Sun I KIM
Journal of Korean Society of Medical Informatics 2009;15(1):117-131
OBJECTIVE: The heartbeat classification of the electrocardiogram is important in cardiac disease diagnosis. For detecting the QRS complex, conventional detection algorithms have been designed to detect the P wave, QRS complex, and T wave first. However, detection of the P and T waves is difficult because their amplitudes are relatively low and they are occasionally buried in noise. Furthermore, the conventional multiclass classification method may produce results skewed toward the majority class because of unbalanced data distribution. METHODS: The Hermite model of the higher order statistics is a good characterization method for recognizing the morphology of the QRS complex. We applied three morphological feature extraction methods for detecting the QRS complex: higher order statistics, Hermite basis functions, and the Hermite model of the higher order statistics. A hierarchical scheme tackles the unbalanced data distribution problem. We also employed a hierarchical classification method using support vector machines. RESULTS: We compared classification methods with feature extraction methods. As a result, the mean sensitivities of our hierarchical classification method (75.47%, 76.16% and 81.21%) were better than that of the conventional multiclass classification method (46.16%). In addition, the Hermite model of the higher order statistics gave the best results compared to the higher order statistics and the Hermite basis functions in the hierarchical classification method. CONCLUSION: This research suggests that the Hermite model of the higher order statistics is feasible for heartbeat feature extraction. The hierarchical classification is also feasible for heartbeat classification tasks that have unbalanced data distribution.
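To give a concrete sense of what higher order statistics of a beat can look like, the sketch below computes skewness and kurtosis (the third and fourth standardized moments) over a window of samples. The abstract does not say which statistics the authors computed, so this choice is an assumption, and the sample values are a hypothetical QRS-like window.

```python
from math import sqrt

def standardized_moment(samples, order):
    """Mean of ((x - mean) / std) ** order, using the population standard deviation."""
    n = len(samples)
    mu = sum(samples) / n
    sigma = sqrt(sum((s - mu) ** 2 for s in samples) / n)
    return sum(((s - mu) / sigma) ** order for s in samples) / n

# Hypothetical beat window with one dominant positive peak (not real ECG data).
beat = [0.0, 0.1, 0.0, -0.2, 1.5, -0.4, 0.0, 0.2, 0.1, 0.0]

skewness = standardized_moment(beat, 3)   # asymmetry of the amplitude distribution
kurtosis = standardized_moment(beat, 4)   # peakedness / tail weight
print(round(skewness, 3), round(kurtosis, 3))
```

A window dominated by a single tall peak yields positive skewness, so such features summarize beat morphology without first locating the P or T wave.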
Classification*
;
Diagnosis
;
Electrocardiography*
;
Heart Diseases
;
Noise
;
Support Vector Machine
7.Tumor segmentation on multi-modality magnetic resonance images based on SVM model parameter optimization.
Xiaochun WANG ; Jing HUANG ; Feng YANG ; Man LUO
Journal of Southern Medical University 2014;34(5):641-645
OBJECTIVE: To develop a method for tumor segmentation on multi-modality magnetic resonance (MR) images based on parameter optimization of the SVM model.
METHODS: Each of the 4 sub-classifiers was trained using the feature information in mono-modality MR images and applied to the corresponding modality images. The classification results differed due to different information in the selected support vectors of the mono-modality images. By modifying the weight values of the error data points, we chose the best weight values of the sub-classifiers to obtain a weighted combination SVM classifier of multi-modalities for use in MR image segmentation.
RESULTS: This tumor image segmentation method was validated on the MR images of brain tumors in 34 patients and resulted in an average classification accuracy of 90.59%. Compared with the 4 mono-modality classifiers, multi-modality RBF kernel SVM classifiers increased the overall accuracy by 5.76%-20.11%.
CONCLUSION: The proposed method combines multi-modality images with SVM classifiers to allow accurate tumor image segmentation from MR images with a high precision.
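A weighted combination of four mono-modality sub-classifiers can be sketched as a weighted sum of their decision values, with the weight vector chosen by accuracy on labelled samples. This is a simplified illustration, not the published method: the abstract does not detail how the weights were optimized, and all scores, labels, and candidate weights below are hypothetical.

```python
def combine(scores, weights):
    """Weighted sum of per-modality decision values; positive means tumor (+1)."""
    total = sum(w * s for w, s in zip(weights, scores))
    return 1 if total > 0 else -1

def accuracy(weights, samples):
    """Fraction of labelled samples the weighted combination classifies correctly."""
    return sum(combine(scores, weights) == label for scores, label in samples) / len(samples)

def best_weights(candidates, samples):
    """Pick the candidate weight vector with the highest accuracy."""
    return max(candidates, key=lambda w: accuracy(w, samples))

# Hypothetical decision values from 4 sub-classifiers for three voxels, with labels.
samples = [
    ([0.8, -0.3, 0.5, -0.1], 1),
    ([-0.6, 0.4, -0.2, 0.3], -1),
    ([0.1, 0.9, -0.7, 0.2], -1),
]
candidates = [[0.25, 0.25, 0.25, 0.25], [0.4, 0.1, 0.4, 0.1]]

chosen = best_weights(candidates, samples)
print(chosen, accuracy(chosen, samples))
```

In this toy case the uniform weights misclassify the third voxel, while the second candidate down-weights the sub-classifier that errs there.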
Brain Neoplasms ; diagnosis ; Humans ; Magnetic Resonance Spectroscopy ; Support Vector Machine
8.Detection of Neural Fates from Random Differentiation: Application of Support Vector Machine.
Min Su LEE ; Jeong Hyuck AHN ; Woong Yang PARK
Genomics & Informatics 2007;5(1):1-5
Embryonic stem cells can be differentiated into various types of cells, a process that requires tight regulation of transcription. Biomarkers related to each cell lineage are used to guide differentiation into neural or other fates. In previous experiments, we reported guided differentiation (GD)-specific genes by comparison with profiles of random differentiation (RD). Interestingly, 68% of the differentially expressed genes in GD overlap with those of RD, which makes it difficult to separate the lineages by examining only a few markers. In this paper, we design a prediction model to distinguish differentiation into neural fates from any other lineage. From the profiles of 11,376 genes, 203 genes differentially expressed between neural and random differentiation were selected by a random variance t-test with 95% confidence and a 5% false discovery rate. Using a support vector machine algorithm, we selected 79 marker genes from the 203 informative genes to construct the optimal prediction model. Here we propose a model for predicting neural fates from random differentiation that achieves perfect accuracy.
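Selecting genes at a 5% false discovery rate is commonly done with the Benjamini-Hochberg step-up procedure. The abstract does not name the procedure used, so the sketch below is an assumption about the general technique, applied to hypothetical p-values rather than the study's data.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the indices of hypotheses rejected at the given false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank k whose p-value satisfies p_(k) <= fdr * k / m.
        if p_values[idx] <= fdr * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])

# Hypothetical per-gene p-values from a two-group test.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_values))  # [0, 1]
```

The selected indices would then be the candidate genes passed on to the classifier-based marker selection.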
Embryonic Stem Cells
;
Stem Cells
;
Support Vector Machine
;
Biomarkers
9.Multi-feature Extraction and Classification of Breast Tumor in Ultrasound Image.
Li REN ; Yangyang LIU ; Ying TONG ; Xuehong CAO ; Yiyun WU
Chinese Journal of Medical Instrumentation 2020;44(4):294-301
OBJECTIVE:
Feature extraction of breast tumors is very important in the breast tumor detection (benign and malignant) in ultrasound image. The traditional quantitative description of breast tumors has some shortcomings, such as inaccuracy. A simple and accurate feature extraction method has been studied.
METHODS:
In this paper, a new method of boundary feature extraction was proposed. Firstly, the shape histogram of ultrasound breast tumors was constructed. Secondly, the relevant boundary feature factors were calculated from a local point of view, including sum of maximum curvature, sum of maximum curvature and peak, and sum of maximum curvature and standard deviation. Based on the boundary features, shape features, and texture features, a linear support vector machine classifier for benign and malignant breast tumor recognition was constructed.
RESULTS:
The accuracy of boundary features in the benign and malignant breast tumors classification was 82.69%. The accuracy of shape features was 73.08%. The accuracy of texture features was 63.46%. The classification accuracy of the three fusion features was 86.54%.
CONCLUSIONS:
The classification accuracy of boundary features was higher than that of texture features and shape features. The classification method based on multi-features has the highest accuracy and it describes the benign and malignant tumors from different angles. The research results have practical value.
Algorithms
;
Breast Neoplasms
;
diagnostic imaging
;
Humans
;
Support Vector Machine
;
Ultrasonography
10.Emotion Recognition Based on Multiple Physiological Signals.
Shali CHEN ; Liuyi ZHANG ; Feng JIANG ; Wanlin CHEN ; Jiajun MIAO ; Hang CHEN
Chinese Journal of Medical Instrumentation 2020;44(4):283-287
Emotion is a series of reactions triggered by a specific object or situation that affects a person's physiological state and can, therefore, be identified by physiological signals. This paper proposes an emotion recognition model. Features were extracted from physiological signals such as photoplethysmography, galvanic skin response, respiration amplitude, and skin temperature. The SVM-RFE-CBR (Support Vector Machine-Recursive Feature Elimination-Correlation Bias Reduction) algorithm was used to select features, and support vector machines were used for classification. Finally, the model was applied to the DEAP dataset in an emotion recognition experiment. On the rating scales of valence, arousal, and dominance, accuracy rates of 73.5%, 81.3%, and 76.1% were obtained, respectively. The results show that emotion recognition can be performed effectively by combining a variety of physiological signals.
Arousal
;
Emotions
;
Galvanic Skin Response
;
Humans
;
Photoplethysmography
;
Support Vector Machine