Thyroid disorder classification using machine learning models

Vincent Peter C Magboo; Ma Sheila A Magboo

Author: Vincent Peter C. Magboo ¹ ; Ma. Sheila A. Magboo ¹
Author Information

1. Department of Physical Sciences and Mathematics, University of the Philippines Manila
Publication Type: Journal Article
Keywords: Thyroid disorders; Feature importance; SMOTE; XGBoost
MeSH: Machine Learning
From: The Philippine Journal of Nuclear Medicine 2022;17(2):54-61
Country: Philippines
Language: English
Abstract: Introduction: Thyroid hormones are produced by the thyroid gland and are essential for regulating the basal metabolic rate. Abnormalities in the levels of these hormones lead to two classes of thyroid diseases – hyperthyroidism and hypothyroidism. Detection and monitoring of these two general classes of thyroid diseases require accurate measurement and interpretation of thyroid function tests. The clinical utility of machine learning models to predict a class of thyroid disorders has not been fully elucidated.
Objective: The objective of this study is to develop machine learning models that classify the type of thyroid disorder on a publicly available thyroid disease dataset extracted from a machine learning data repository.
Methods: Several machine learning algorithms for classifying thyroid disorders were applied after a series of pre-processing steps on the dataset.
Results: The best-performing model was XGBoost, with 99% accuracy and very good recall, precision, and F1-scores for each of the three thyroid classes. All models except Naïve Bayes predicted the negative class well, scoring over 90% on all metrics. For hypothyroidism, XGBoost, decision tree, and random forest performed best, with metric values ranging from 96% to 100%. For hyperthyroidism, in contrast, all models showed lower classification performance than for the negative and hypothyroid classes. Even so, XGBoost and random forest obtained good metric values, ranging from 71% to 89%, for the hyperthyroid class.
Conclusion: The findings of this study are encouraging and generated useful insights into the application and development of faster, highly reliable automated models that can be of use to clinicians in the assessment of thyroid diseases. Early clinical assessment, coupled with the integration of these machine learning models into practice, can support prompt and precise diagnosis and the formulation of personalized treatment options to ensure the best quality of care for our patients.
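Note: The Results report recall, precision, and F1-score separately for each of the three classes alongside overall accuracy. As a minimal sketch of how such per-class metrics can be computed from true and predicted labels, the Python below uses one-vs-rest counts. The label strings, the function names `per_class_metrics` and `accuracy`, and the toy data are illustrative assumptions, not the study's code or data.

```python
from collections import Counter

# Hypothetical label encoding: the abstract names three classes
# but does not say how they are encoded in the dataset.
CLASSES = ("negative", "hypothyroid", "hyperthyroid")


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Return {class: (precision, recall, f1)} from one-vs-rest counts."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative
    metrics = {}
    for c in classes:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = (precision, recall, f1)
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up toy labels (not from the study): one hyperthyroid case is
# misclassified as negative.
toy_true = ["negative"] * 6 + ["hypothyroid"] * 2 + ["hyperthyroid"] * 2
toy_pred = ["negative"] * 6 + ["hypothyroid"] * 2 + ["hyperthyroid", "negative"]

print("accuracy:", accuracy(toy_true, toy_pred))
for name, (p, r, f) in per_class_metrics(toy_true, toy_pred).items():
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In the toy data a single error in a small class lowers that class's recall far more than it lowers overall accuracy, which is one reason per-class metrics are reported in addition to accuracy.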
Full text: 17 (2) article 5.pdf