1.Heart Alert: A heart disease prediction system using machine learning approach and optimization techniques
Justin Allen P. Denopol ; Ma. Sheila A. Magboo ; Vincent Peter C. Magboo
Philippine Journal of Health Research and Development 2022;26(3):83-92
Background:
Cardiovascular diseases are among the top three leading causes of mortality in the Philippines, accounting for 17.8% of total deaths. Lifestyle-related factors such as alcohol consumption, smoking, poor diet and nutrition, high sedentary behavior, overweight, and obesity have been increasingly implicated in the high rates of heart disease among Filipinos, placing a significant burden on the country's healthcare system. The objective of this study was to predict the presence of heart disease using various machine learning algorithms (support vector machine, naïve Bayes, random forest, logistic regression, decision tree, and adaptive boosting) evaluated on an anonymized, publicly available cardiovascular disease dataset.
Methodology:
Various machine learning algorithms were applied to an anonymized, publicly available cardiovascular dataset from a machine learning data repository (IEEE Dataport). A web-based application named Heart Alert, which predicts the risk of developing heart disease, was developed from the best-performing machine learning model. The effects of different optimization techniques on the classification performance of the algorithms were assessed, namely the imputation method (mean, median, mode, and multiple imputation by chained equations) and the feature selection method (recursive feature elimination). All simulation experiments were implemented in Python 3.8 with its machine learning libraries (Scikit-learn, Keras, Tensorflow, Pandas, Matplotlib, Seaborn, NumPy).
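The single-value imputation strategies named above (mean, median, mode) can be illustrated with a minimal standard-library sketch. The `impute` helper and the cholesterol values are hypothetical and are not taken from the study, which reports using Scikit-learn. Multiple imputation by chained equations is not covered here.

```python
from statistics import mean, median, mode


def impute(column, strategy="mean"):
    """Replace None entries in a numeric column with one summary statistic.

    Hypothetical helper for illustration only; not the study's code.
    """
    observed = [v for v in column if v is not None]
    # Pick the summary function for the requested strategy.
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in column]


# Made-up cholesterol readings with two missing entries.
cholesterol = [200, None, 240, 200, None, 310]
for strategy in ("mean", "median", "mode"):
    print(strategy, impute(cholesterol, strategy))
```

Each strategy fills every gap in a column with the same value, which is why the abstract treats the choice of strategy as a tunable step whose effect on the classifier has to be measured.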
Results:
The support vector machine without imputation and feature selection obtained the highest performance metrics (90.2% accuracy, 87.7% sensitivity, 93.6% specificity, 94.9% precision, 91.2% F1-score, and an area under the receiver operating characteristic curve of 0.902) and was used to implement the heart disease prediction system (Heart Alert). Following very closely were random forest with mean or median imputation and logistic regression with mode imputation, all without feature selection.
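For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, precision, and F1-score follow from the four cells of a binary confusion matrix. The function name and the counts are hypothetical and do not reproduce the study's confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Made-up counts for illustration only.
metrics = classification_metrics(tp=50, fp=5, tn=40, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```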
Conclusion:
The performance of the best four machine learning models suggests that, for this dataset, imputation of missing values is optional. Likewise, recursive feature elimination may not be needed, as all variables seem to be important in heart disease prediction. An early, accurate diagnosis leading to prompt intervention is crucial, as it improves the patient's quality of life and diminishes the risk of developing cardiac events.
Machine Learning
;
Support Vector Machine
2.Population Pharmacokinetic and Pharmacodynamic Models of Propofol in Healthy Volunteers using NONMEM and Machine Learning Methods.
Yoo Mi KIM ; Sung Hong KANG ; Il Su PARK ; Gyu Jeong NOH
Journal of Korean Society of Medical Informatics 2008;14(2):147-159
OBJECTIVES: The primary objective of this study was to compare the model performance of machine learning methods with that of a previous study in which a nonlinear mixed effects model was created using NONMEM(R) for the pharmacokinetic and pharmacodynamic data for propofol. The secondary objective was to evaluate whether a pharmacodynamic model describing the relationship between the dose of propofol and bispectral index (BIS) outperforms one describing the relationship between propofol concentrations predicted by a pharmacokinetic model and BIS. METHODS: Data were collected during a study involving the infusion of propofol into healthy volunteers. Pharmacokinetic and pharmacodynamic models were constructed using artificial neural networks (ANNs), support vector machines (SVMs), and multi-method ensembles and were compared with the nonlinear mixed effects method as implemented by NONMEM(R). Model performance was assessed by goodness-of-fit statistics, paired t-tests between predicted and observed values for each model, and scatterplots. RESULTS: In pharmacokinetic analysis, ensemble I, the mean of ANN and NONMEM(R) predictions, achieved minimal error and the highest correlation coefficient. SVM produced the highest error and the lowest correlation coefficient. In pharmacodynamic analysis, ANN exhibited the best performance. An ANN model describing the relationship between the dose of propofol and BIS was not inferior to an ANN model describing the relationship between predicted concentrations of propofol derived from an ANN pharmacokinetic model and BIS. CONCLUSIONS: In pharmacokinetic analysis, the ensemble combined with ANN achieved slightly better performance than NONMEM(R). The relationship between the dose of propofol and BIS can be predicted without considering the pharmacokinetics of propofol.
Machine Learning
;
Propofol
;
Support Vector Machine
3.MicroRNA Target Prediction Based on Support Vector Machine Ensemble Classification Algorithm of Under-sampling Technique.
Journal of Biomedical Engineering 2016;33(1):72-77
Considering the low prediction accuracy for positive samples and the poor overall classification caused by unbalanced sample data of MicroRNA (miRNA) targets, we propose a support vector machine (SVM)-integration of under-sampling and weight (IUSM) algorithm in this paper, an ensemble learning algorithm based on under-sampling. The algorithm adopts SVM as the learning algorithm and AdaBoost as the integration framework, and embeds clustering-based under-sampling into the iterative process, aiming to reduce the imbalance between positive and negative samples. Meanwhile, in the process of adaptive weight adjustment of the samples, the SVM-IUSM algorithm eliminates abnormal negative samples with a robust sample weight smoothing mechanism so as to avoid over-learning. Finally, the integrated classifier for miRNA target prediction is obtained by combining multiple weak classifiers through a voting mechanism. The experiment revealed that the SVM-IUSW, compared with other algorithms on unbalanced dataset collection, could not only improve the accuracy of positive targets and the overall effect of classification, but also enhance the generalization ability of the miRNA target classifier.
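The final voting step can be sketched as follows. This is a minimal illustration using the standard AdaBoost vote weight, which is an assumption: the abstract does not state how the votes of the weak classifiers are weighted. The function names, predictions, and error rates are hypothetical.

```python
import math


def classifier_weight(error_rate):
    """Standard AdaBoost vote weight: a lower weighted error earns a larger say."""
    return 0.5 * math.log((1.0 - error_rate) / error_rate)


def weighted_vote(predictions, error_rates):
    """Combine +1/-1 predictions from several weak classifiers into one label."""
    score = sum(classifier_weight(e) * p for p, e in zip(predictions, error_rates))
    return 1 if score >= 0 else -1


# Made-up example: one accurate classifier says "target" (+1),
# two weaker classifiers say "non-target" (-1).
print(weighted_vote([+1, -1, -1], [0.10, 0.40, 0.45]))
```

Because the vote is weighted, a single accurate classifier can outvote several classifiers that are only slightly better than chance.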
Algorithms
;
MicroRNAs
;
chemistry
;
Support Vector Machine
4.MicroRNA target prediction based on SVM and the optimized feature set.
Baowen WANG ; Xiaoyang QI ; Changwu WANG ; Wenyuan LIU ; Yali SI
Journal of Biomedical Engineering 2013;30(6):1213-1218
MicroRNAs (miRNAs) are a family of endogenous single-stranded RNAs about 22 nucleotides in length. By targeting the 3' UTR of messenger RNA (mRNA), they play important roles in post-transcriptional regulation. Further research into miRNA function urgently requires the identification of more positive miRNA targets. To address the high-dimensional, small-sample data sets in miRNA target prediction, this paper proposes an algorithm for eliminating redundant features based on ν-SVM, in which classification and feature selection are fused. The algorithm optimizes the combination of features and then constructs the best feature combination to represent the miRNA-target interaction model. The prior parameter ν (0 < ν ≤ 1) controls the compression proportion of the data set and selects more distinguishing support vectors. Finally, the classifier model for miRNA target prediction is built. An unbiased assessment of the classifier is achieved with a completely independent test dataset. Experimental results indicated that, in both classification recognition and generalization performance of miRNA target prediction, this model was superior to existing machine learning algorithms such as miTarget, NBmiRTar, and TargetMiner.
MicroRNAs
;
Models, Theoretical
;
Support Vector Machine
5.Classification Model of Corneal Opacity Based on Digital Image Features.
Peng LUO ; Jilong ZHENG ; Peng ZHOU ; Yongde ZHANG ; Shijie CHANG ; Xianzheng SHA
Chinese Journal of Medical Instrumentation 2021;45(4):361-365
OBJECTIVE:
According to the digital image features of corneal opacity, a multi classification model of support vector machine (SVM) was established to explore the objective quantification method of corneal opacity.
METHODS:
Digital images of corneas from dead pigs were collected, a subset of color and texture features was extracted based on previous experience, and an SVM multi classification model was established. The test results of the model were evaluated by precision, sensitivity and
RESULTS:
In the classification of corneal opacity, the highest
CONCLUSIONS:
The SVM multi classification model can classify the degree of corneal opacity.
Animals
;
Corneal Opacity
;
Support Vector Machine
;
Swine
6.Hierarchical Classification of ECG Beat Using Higher Order Statistics and Hermite Model.
Kwan Soo PARK ; Baek Hwan CHO ; Do Hoon LEE ; Su Hwa SONG ; Jong Shill LEE ; Young Joon CHEE ; In Young KIM ; Sun I KIM
Journal of Korean Society of Medical Informatics 2009;15(1):117-131
OBJECTIVE: The heartbeat classification of the electrocardiogram is important in cardiac disease diagnosis. For detecting the QRS complex, conventional detection algorithms have been designed to detect the P, QRS, and T waves first. However, detection of the P and T waves is difficult because their amplitudes are relatively low and they are occasionally buried in noise. Furthermore, the conventional multiclass classification method may have results skewed toward the majority class because of the unbalanced data distribution. METHODS: The Hermite model of the higher order statistics is a good characterization method for recognizing the morphology of the QRS complex. We applied three morphological feature extraction methods for detecting the QRS complex: higher order statistics, Hermite basis functions, and the Hermite model of the higher order statistics. A hierarchical scheme tackles the unbalanced data distribution problem. We also employed a hierarchical classification method using support vector machines. RESULTS: We compared classification methods with feature extraction methods. As a result, our mean values of sensitivity for the hierarchical classification method (75.47%, 76.16% and 81.21%) give better performance than the conventional multiclass classification method (46.16%). In addition, the Hermite model of the higher order statistics gave the best results compared to the higher order statistics and the Hermite basis functions in the hierarchical classification method. CONCLUSION: This research suggests that the Hermite model of the higher order statistics is feasible for heartbeat feature extraction. The hierarchical classification is also feasible for heartbeat classification tasks that have an unbalanced data distribution.
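As a rough illustration of what higher order statistics capture, the sketch below computes skewness and kurtosis (the standardized third and fourth moments) of a sampled beat. Treating these two quantities as the features is an assumption: the abstract does not specify which higher order statistics were used. The helper name and the sample values are hypothetical.

```python
from statistics import fmean, pstdev


def standardized_moment(samples, order):
    """Mean of ((x - mean) / std) ** order; order 3 is skewness, order 4 is kurtosis."""
    mu = fmean(samples)
    sigma = pstdev(samples)  # population standard deviation
    return fmean([((x - mu) / sigma) ** order for x in samples])


# Made-up samples around a QRS-like spike (arbitrary units).
beat = [0.0, 0.1, 0.0, -0.2, 1.0, -0.3, 0.0, 0.2, 0.1, 0.0]
features = {
    "skewness": standardized_moment(beat, 3),
    "kurtosis": standardized_moment(beat, 4),
}
print(features)
```

Moments of order three and above respond to the asymmetry and peakedness of the waveform, which is what makes them candidates for describing beat morphology.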
Classification*
;
Diagnosis
;
Electrocardiography*
;
Heart Diseases
;
Noise
;
Support Vector Machine
7.Detection of Neural Fates from Random Differentiation: Application of Support Vector Machine.
Min Su LEE ; Jeong Hyuck AHN ; Woong Yang PARK
Genomics & Informatics 2007;5(1):1-5
Embryonic stem cells can be differentiated into various types of cells, which requires tight regulation of transcription. Biomarkers related to each cell lineage are used to guide differentiation into neural or other fates. In previous experiments, we reported guided differentiation (GD)-specific genes by comparison with profiles of random differentiation (RD). Interestingly, 68% of the differentially expressed genes in GD overlap with those of RD, which makes it difficult to separate the lineages by examining several markers. In this paper, we design a prediction model to distinguish differentiation into neural fates from any other lineage. From the profiles of 11,376 genes, 203 genes differentially expressed between neural and random differentiation were selected by random variance t-test with 95% confidence and a 5% false discovery rate. Using a support vector machine algorithm, we selected 79 marker genes from the 203 informative genes to construct the optimal prediction model. Here we propose a model for predicting neural fates from random differentiation that is constructed with perfect accuracy.
Embryonic Stem Cells
;
Stem Cells
;
Support Vector Machine
;
Biomarkers
8.Origin identification of Poria cocos based on hyperspectral imaging technology.
Xue SUN ; Deng-Ting ZHANG ; Hui WANG ; Cong ZHOU ; Jian YANG ; Dai-Yin PENG ; Xiao-Bo ZHANG
China Journal of Chinese Materia Medica 2023;48(16):4337-4346
To realize non-destructive and rapid origin discrimination of Poria cocos in batches, this study established a P. cocos origin recognition model based on hyperspectral imaging combined with machine learning. P. cocos samples from Anhui, Fujian, Guangxi, Hubei, Hunan, Henan, and Yunnan were used as the research objects. Hyperspectral data were collected in the visible and near-infrared band (V-band, 410-990 nm) and the shortwave infrared band (S-band, 950-2500 nm). The original spectral data were divided into S-band, V-band, and full-band. The original data (RD) of the different bands were pretreated with multiplicative scatter correction (MSC), standard normal variate (SNV), S-G smoothing (SGS), first derivative (FD), second derivative (SD), and other methods. Then the data were classified according to three different types of producing areas: province, county, and batch. The origin identification model was established by partial least squares discriminant analysis (PLS-DA) and linear support vector machine (LinearSVC). Finally, a confusion matrix was employed to evaluate the optimal model, with F1 score as the evaluation standard. The results revealed that the origin identification model established by FD combined with LinearSVC had the highest prediction accuracy in the full-band range classified by province, the V-band range by county, and the full-band range by batch, which were 99.28%, 98.55%, and 97.45%, respectively, and the overall F1 scores of these three models were 99.16%, 98.59%, and 97.58%, respectively, indicating excellent performance of these models. Therefore, hyperspectral imaging combined with LinearSVC can realize the non-destructive, accurate, and rapid identification of P. cocos from different producing areas in batches, which is conducive to the directional research and production of P. cocos.
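Two of the pretreatments named above can be sketched in a few lines. This is a minimal illustration, assuming SNV is applied per spectrum and the first derivative is a plain finite difference. The abstract does not say how the derivative was computed, and in practice it is often obtained with a Savitzky-Golay filter. The function names and reflectance values are hypothetical.

```python
from statistics import fmean, stdev


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own mean and std."""
    mu = fmean(spectrum)
    sd = stdev(spectrum)  # sample standard deviation of this spectrum
    return [(v - mu) / sd for v in spectrum]


def first_derivative(spectrum):
    """Finite-difference first derivative between adjacent wavelength channels."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


# Made-up reflectance values at five adjacent wavelengths.
raw = [0.42, 0.45, 0.51, 0.50, 0.47]
print(snv(raw))
print(first_derivative(raw))
```

SNV removes per-sample offset and scale differences such as scatter effects, while the derivative emphasizes changes in slope rather than absolute reflectance.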
Hyperspectral Imaging
;
Wolfiporia
;
China
;
Least-Squares Analysis
;
Support Vector Machine
9.Gastrointestinal polyp detection in endoscopic images using an improved feature extraction method
Mustain BILLAH ; Sajjad WAHEED
Biomedical Engineering Letters 2018;8(1):69-75
Gastrointestinal polyps are regarded as precursors of cancer development, so the possibility of cancer can be reduced to a great extent by early detection and removal of polyps. The most widely used diagnostic modality for gastrointestinal polyps is video endoscopy. However, because it is an operator-dependent procedure, several human factors can lead to missed detection of polyps. In this paper, an improved computer-aided polyp detection method is proposed. The proposed method can reduce the polyp miss rate and assist doctors in finding the most important regions to pay attention to. Color wavelet features and convolutional neural network features are extracted from endoscopic images and used to train a support vector machine. A target endoscopic image is then given to the classifier as input to determine whether it contains any polyp. If a polyp is found, it is marked automatically. Experiments show that color wavelet features and convolutional neural network features together form a highly representative description of endoscopic polyp images. Evaluations on standard public databases show that the proposed system outperforms state-of-the-art methods, achieving an accuracy of 98.34%, a sensitivity of 98.67%, and a specificity of 98.23%. In this paper, the strength of color wavelet features and the power of convolutional neural network features are combined. Fusion of these two methodologies and use of a support vector machine result in an improved method for gastrointestinal polyp detection. ROC analysis reveals that the proposed method can be used for polyp detection with greater accuracy than state-of-the-art methods.
Endoscopy
;
Humans
;
Methods
;
Polyps
;
Sensitivity and Specificity
;
Support Vector Machine
10.A pace recognition method for exoskeleton wearers based on support vector machine-hidden Markov model.
Dong HU ; Zuojun LIU ; Lingling CHEN ; Qian WANG
Journal of Biomedical Engineering 2022;39(1):84-91
To improve the motion fluency and coordination between lower extremity exoskeleton robots and their wearers, a pace recognition method for exoskeleton wearers based on inertial sensors is proposed. First, the triaxial acceleration and triaxial angular velocity signals at the thigh and calf were collected by inertial sensors. Then the signal segment covering the 0.5 seconds before the current time was extracted by the time window method, and the Fourier transform coefficients of the frequency-domain signal were used as feature values. Next, the support vector machine (SVM) and hidden Markov model (HMM) were combined into a classification model, which was trained and tested for pace recognition. Finally, the pace change rule and the human-machine interaction force were combined in this model, and the current pace was predicted by the model. The experimental results showed that the pace intention of the lower extremity exoskeleton wearer could be effectively identified by the proposed method, and the recognition rate for the seven pace patterns reached 92.14%. The method provides a new way to achieve smooth control of the exoskeleton.
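The windowing and Fourier feature step can be sketched as below. This is a minimal illustration on a single channel: the 100 Hz sampling rate, the synthetic 4 Hz signal, and the function names are assumptions, and the abstract does not state how many coefficients were kept or how the channels were combined.

```python
import cmath
import math


def latest_window(signal, sample_rate_hz, window_s=0.5):
    """Return the most recent window_s seconds of samples."""
    n = int(sample_rate_hz * window_s)
    return signal[-n:]


def dft_magnitudes(window):
    """Normalized magnitudes of the DFT coefficients from 0 Hz up to Nyquist."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(window))) / n
        for k in range(n // 2 + 1)
    ]


# Synthetic 2 s recording at an assumed 100 Hz with a 4 Hz oscillation.
sample_rate_hz = 100
signal = [math.sin(2 * math.pi * 4 * t / sample_rate_hz) for t in range(200)]
window = latest_window(signal, sample_rate_hz)
features = dft_magnitudes(window)
peak_bin = features.index(max(features))
print(peak_bin * sample_rate_hz / len(window), "Hz")
```

With a 0.5 s window the frequency resolution is 2 Hz per coefficient, so a short window trades spectral detail for a faster response to changes in pace.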
Algorithms
;
Exoskeleton Device
;
Humans
;
Lower Extremity
;
Motion
;
Support Vector Machine