Thyroid disorder classification using machine learning models
- Author:
Vincent Peter C. Magboo(1); Ma. Sheila A. Magboo(1)
Author Information
1. Department of Physical Sciences and Mathematics, University of the Philippines Manila
- Publication Type:Journal Article
- Keywords:
Thyroid disorders;
Feature importance;
SMOTE;
XGBoost
- MeSH:
Machine Learning
- From:
The Philippine Journal of Nuclear Medicine
2022;17(2):54-61
- Country:Philippines
- Language:English
- Abstract:
Introduction: Thyroid hormones are produced by the thyroid gland and are essential for regulating the basal metabolic rate.
Abnormalities in the levels of these hormones lead to two classes of thyroid diseases – hyperthyroidism and
hypothyroidism. Detection and monitoring of these two general classes of thyroid diseases require accurate
measurement and interpretation of thyroid function tests. The clinical utility of machine learning models to
predict a class of thyroid disorders has not been fully elucidated.
Objective: This study aims to develop machine learning models that classify the type of thyroid disorder using a
publicly available thyroid disease dataset obtained from a machine learning data repository.
Methods: Several machine learning algorithms for classifying thyroid disorders were applied after a series of
pre-processing steps had been carried out on the dataset.
Results: The best-performing model was XGBoost, with 99% accuracy and very good recall,
precision, and F1-scores for each of the three thyroid classes. In general, all models except Naïve
Bayes did well in predicting the negative class, scoring over 90% on all metrics. For predicting
hypothyroidism, XGBoost, decision tree, and random forest performed best, with
metric values ranging from 96% to 100%. On the other hand, in predicting hyperthyroidism, all models had lower
classification performance than for the negative and hypothyroid classes. Nonetheless, XGBoost and
random forest obtained good metric values, ranging from 71% to 89%, in predicting the hyperthyroid class.
Conclusion: The findings of this study are encouraging and generate useful insights into the application and
development of faster, highly reliable automated models that can be of use to clinicians in the
assessment of thyroid diseases. Early clinical assessment, coupled with the integration of these
machine learning models into practice, can support prompt and precise diagnosis and the formulation of
personalized treatment options to ensure the best quality of care for our patients.
- Full text:17 (2) article 5.pdf
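
The abstract reports accuracy together with per-class precision, recall, and F1-score for three classes (negative, hypothyroid, hyperthyroid). As a minimal sketch of how such per-class metrics are typically computed, and not the authors' actual pipeline, the following uses plain Python on a small set of made-up labels. The function name `per_class_metrics`, the label strings, and the toy predictions are all assumptions introduced for illustration; none of the numbers come from the study.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred, labels):
    """Return precision, recall, and F1 for each label (one-vs-rest)."""
    # Count how often each (true label, predicted label) pair occurs.
    pairs = Counter(zip(y_true, y_pred))
    report = {}
    for label in labels:
        tp = pairs[(label, label)]
        # False positives: predicted as `label` but truly another class.
        fp = sum(n for (t, p), n in pairs.items() if p == label and t != label)
        # False negatives: truly `label` but predicted as another class.
        fn = sum(n for (t, p), n in pairs.items() if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Hypothetical toy labels for illustration only; NOT data from the study.
LABELS = ["negative", "hypothyroid", "hyperthyroid"]
y_true = ["negative"] * 6 + ["hypothyroid"] * 2 + ["hyperthyroid"] * 2
y_pred = (
    ["negative"] * 5
    + ["hyperthyroid"]
    + ["hypothyroid"] * 2
    + ["hyperthyroid", "negative"]
)

report = per_class_metrics(y_true, y_pred, LABELS)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

if __name__ == "__main__":
    print(f"accuracy: {accuracy:.2f}")
    for label in LABELS:
        m = report[label]
        print(
            f"{label:>12}: precision={m['precision']:.2f} "
            f"recall={m['recall']:.2f} f1={m['f1']:.2f}"
        )
```

Reporting the metrics per class, rather than accuracy alone, matters when classes are imbalanced: in this toy example the minority hyperthyroid class scores well below the overall accuracy, which is the kind of gap the abstract describes for the hyperthyroid class.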