A Study on the Application of Natural Language Processing in Health Care Big Data: Focusing on Word Embedding Methods
10.4332/KJHPA.2020.30.1.15
- Author: Hansang KIM(1); Yeojin CHUNG
- Author Information: 1. Review and Assessment Research Department, Health Insurance Review & Assessment Service, Wonju, Korea
- Publication Type:REVIEW ARTICLE
- From:Health Policy and Management 2020;30(1):15-25
- Country:Republic of Korea
- Language:Korean
- Abstract:
Healthcare data sets contain extensive information about patients, but their intrinsic characteristics, such as heterogeneity, longitudinal irregularity, and noise, make them difficult for researchers to analyze. In particular, because most medical history information is recorded as text codes, its use has been limited by the high dimensionality of the resulting explanatory variables. To address this problem, recent studies have applied word embedding techniques, originally developed for natural language processing, and have reported positive results in terms of dimensionality reduction and prediction model accuracy. This paper reviews deep learning-based natural language processing techniques (word embedding) and summarizes research cases that have used them in the health care field. Finally, we propose a research framework for applying deep learning-based natural language processing to the analysis of domestic health insurance data.
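
To make the dimensionality argument concrete, the sketch below maps medical codes to dense vectors using the codes that co-occur with them in the same visit. It is an illustration only and is not the method of the reviewed studies: the visit data and code strings are hypothetical, the embedding size `dim` is an assumed value, and a truncated SVD of co-occurrence counts stands in for the neural word embedding models (such as word2vec) that the paper reviews.

```python
import numpy as np

# Hypothetical toy data: each visit is a list of diagnosis codes.
visits = [
    ["E11", "I10", "E78"],
    ["E11", "I10"],
    ["J45", "J30"],
    ["J45", "J30", "L20"],
    ["I10", "E78"],
]

# Vocabulary of distinct codes; a one-hot encoding would need len(vocab) columns.
vocab = sorted({code for visit in visits for code in visit})
index = {code: i for i, code in enumerate(vocab)}

# Count how often two different codes appear in the same visit.
cooc = np.zeros((len(vocab), len(vocab)))
for visit in visits:
    for a in visit:
        for b in visit:
            if a != b:
                cooc[index[a], index[b]] += 1

# Truncated SVD of the log-scaled counts yields dense, low-dimensional vectors.
# This is a simple stand-in for a trained embedding model (assumption).
dim = 2  # assumed embedding size; in real data it is far smaller than the vocabulary
u, s, _ = np.linalg.svd(np.log1p(cooc))
embeddings = u[:, :dim] * np.sqrt(s[:dim])


def visit_vector(visit):
    """Represent a visit as the mean of its code embeddings."""
    return embeddings[[index[c] for c in visit]].mean(axis=0)


def cosine(x, y):
    """Cosine similarity between two vectors."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
```

With vectors like these, each visit becomes a fixed-length feature of size `dim` regardless of how many distinct codes exist, and codes that tend to appear together end up closer to each other than codes that never do, which is the property the reviewed studies exploit in prediction models.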