A data-driven method for syndrome type identification and classification in traditional Chinese medicine.
10.1016/S2095-4964(17)60328-5
- Author: Nevin Lianwen ZHANG 1; Chen FU 2; Teng Fei LIU 1; Bao-Xin CHEN 2; Kin Man POON 3; Pei Xian CHEN 1; Yun-Ling ZHANG 2
Author Information
1. Department of Computer Science and Engineering, the Hong Kong University of Science and Technology, Hong Kong, China.
2. Department of Neurology, Dongfang Hospital, Beijing University of Chinese Medicine, Beijing 100078, China.
3. Department of Mathematics and Information Technology, the Education University of Hong Kong, Hong Kong, China.
- Publication Type:Journal Article
- MeSH:
Data Collection;
Data Interpretation, Statistical;
Diagnosis, Differential;
Humans;
Medicine, Chinese Traditional
- From:
Journal of Integrative Medicine
2017;15(2):110-123
- Country:China
- Language:English
- Abstract:
The efficacy of traditional Chinese medicine (TCM) treatments for Western medicine (WM) diseases relies heavily on the proper classification of patients into TCM syndrome types. The authors developed a data-driven method for solving the classification problem, where syndrome types were identified and quantified based on statistical patterns detected in unlabeled symptom survey data. The new method is a generalization of latent class analysis (LCA), which has been widely applied in WM research to solve a similar problem, i.e., to identify subtypes of a patient population in the absence of a gold standard. A well-known weakness of LCA is that it makes an unrealistically strong independence assumption. The authors relaxed the assumption by first detecting symptom co-occurrence patterns from survey data and then using those statistical patterns, instead of the symptoms themselves, as features for LCA. This new method consists of six steps: data collection, symptom co-occurrence pattern discovery, statistical pattern interpretation, syndrome identification, syndrome type identification and syndrome type classification. A software package called Lantern has been developed to support the application of the method. The method was illustrated using a data set on vascular mild cognitive impairment.
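To make the baseline concrete, the following is a minimal sketch of plain latent class analysis on binary symptom data, fitted with expectation-maximization. It illustrates only the standard LCA model with its independence assumption (symptoms are treated as mutually independent given the latent class), which is the assumption the abstract says the authors relax. It is not the authors' method and not the Lantern software; the function name `fit_lca`, its parameters, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_lca(X, n_classes, n_iter=200, seed=0):
    """Fit a basic latent class model to binary data with EM.

    X is an (n_patients, n_symptoms) array of 0/1 values.
    Returns class priors, per-class symptom probabilities, and
    posterior class memberships. Hypothetical helper, for illustration.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(n_classes, 1.0 / n_classes)               # P(class)
    theta = rng.uniform(0.25, 0.75, size=(n_classes, d))   # P(symptom=1 | class)

    for _ in range(n_iter):
        # E-step: posterior over classes for each patient, in log space.
        # The sum over symptoms encodes the local independence assumption.
        log_post = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(pi)
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate priors and symptom probabilities.
        nk = resp.sum(axis=0) + 1e-9
        pi = nk / nk.sum()
        theta = np.clip((resp.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)

    return pi, theta, resp


if __name__ == "__main__":
    # Synthetic survey: two latent patient subtypes, six binary symptoms.
    rng = np.random.default_rng(1)
    z = rng.integers(0, 2, size=400)
    true_theta = np.array([[0.9, 0.9, 0.9, 0.1, 0.1, 0.1],
                           [0.1, 0.1, 0.1, 0.9, 0.9, 0.9]])
    X = (rng.random((400, 6)) < true_theta[z]).astype(float)

    pi, theta, resp = fit_lca(X, n_classes=2)
    print("estimated class priors:", np.round(pi, 2))
    print("estimated symptom probabilities:\n", np.round(theta, 2))
```

Each patient is then assigned to the class with the highest posterior in `resp`. In the approach the abstract describes, the input features would be the discovered symptom co-occurrence patterns rather than the raw symptoms used here.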