1. Concerns of Thalassemia Patients, Carriers, and their Caregivers in Malaysia: Text Mining Information Shared on Social Media
Yuen Chi PHANG ; Azleena Mohd KASSIM ; Ernest MANGANTIG
Healthcare Informatics Research 2021;27(3):200-213
Objectives:
The main aim of this study was to use text mining on social media to analyze information and gain insight into the health-related concerns of thalassemia patients, thalassemia carriers, and their caregivers.
Methods:
Posts from two Facebook groups whose members consisted of thalassemia patients, thalassemia carriers, and caregivers in Malaysia were extracted using the Data Miner tool. A new framework, Malay-English social media text pre-processing, was proposed to pre-process the noisy mixed-language (Malay-English) social media posts. Topic modeling was used to identify hidden topics within the posts shared among members. Three topic models (latent Dirichlet allocation [LDA] in GenSim, LDA in MALLET, and latent semantic analysis) were applied in Python to the dataset, both with and without stemming.
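The pre-processing step described above can be illustrated with a minimal sketch. It is not the authors' framework: the stopword list, the short-form dictionary, the suffix list, and the function names `preprocess`, `naive_stem`, and `bag_of_words` are all hypothetical and chosen only to show how noisy Malay-English posts might be turned into token counts, with and without stemming, before a topic model is fitted.

```python
import re
from collections import Counter

# Hypothetical, tiny resources for illustration only; the study's actual
# stopword lists and short-form dictionaries are not given in the abstract.
STOPWORDS = {"the", "and", "is", "for", "my", "yang", "dan", "di", "saya", "untuk"}
SHORT_FORMS = {"sy": "saya", "utk": "untuk", "hosp": "hospital", "x": "tidak"}

URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[a-z]+")


def naive_stem(token):
    """Strip a few common English/Malay suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "nya", "kan", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(post, stem=False):
    """Clean one post and return its tokens."""
    text = URL_RE.sub(" ", post.lower())              # drop links
    tokens = TOKEN_RE.findall(text)                   # keep alphabetic tokens only
    tokens = [SHORT_FORMS.get(t, t) for t in tokens]  # expand short forms
    tokens = [t for t in tokens if t not in STOPWORDS and len(t) > 1]
    if stem:
        tokens = [naive_stem(t) for t in tokens]
    return tokens


def bag_of_words(posts, stem=False):
    """Return one token-count mapping per post, the usual input to a topic model."""
    return [Counter(preprocess(p, stem=stem)) for p in posts]
```

Calling `bag_of_words(posts, stem=False)` and `bag_of_words(posts, stem=True)` on the same posts would give the two variants of the corpus that a topic model could then be compared on.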
Results:
LDA in MALLET without stemming was found to be the best topic model for this dataset. Eight topics were identified within the posts shared by members. Of those eight topics, four were newly discovered by this study, and four others corresponded to the findings of previous studies that used an interview approach.
Conclusions:
Topic 2 (the challenges faced by thalassemia patients) was found to be the topic with the highest attention and engagement. Healthcare practitioners and other concerned parties should make an effort to build a stronger support system related to this issue for those affected by thalassemia.