1. A Study on the Named Entity Recognition Method on Symptom Names in the History of Present Illness in Traditional Chinese Medical (TCM) Clinic
Yuhu YUAN ; Xuezhong ZHOU ; Runshun ZHANG ; Xiaodong LI
World Science and Technology-Modernization of Traditional Chinese Medicine 2017;19(1):70-77
Clinical cases in TCM are important clinical data that record, as text, the whole process of interaction between doctors and patients. However, in the context of big data, there has been little research on using the information contained in these cases. This paper therefore studies a method for extracting symptom terms from the history of present illness in TCM clinical records, in order to lay a foundation for further use of clinical cases. First, 12,367 history-of-present-illness records were obtained by random selection and expert review. By disease type, they were divided into two experimental groups, 4,838 records in the diabetes group and 7,529 records in the spleen and stomach disease group, and all 12,367 records together formed a mixed (combined) group. A glossary of symptom terms covering 22,996 words was compiled. Then, five feature templates, including sliding window features, prefix and suffix characters, and lexical features, were selected. A conditional random fields (CRFs) model was adopted to carry out the named entity extraction experiments. In the open test, the performance of the diabetes, spleen and stomach disease, and mixed groups was (0.83, 0.8, 0.82), (0.9, 0.9, 0.89) and (0.88, 0.87, 0.87), respectively, while the corresponding results in ten-fold cross validation were (0.83, 0.82, 0.83), (0.95, 0.95, 0.95) and (0.93, 0.92, 0.92). In conclusion, the results showed that the CRFs algorithm performs well as a sequence labeling algorithm and can be applied to the named entity extraction of symptoms from the history of present illness.
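The feature templates named in this abstract can be illustrated with a small sketch of character-level feature extraction of the kind typically fed to a CRF tagger. This is not the authors' implementation: the abstract does not specify the window size, what exactly the prefix and suffix features are, or how the glossary is matched, so the window of two characters, the bigram interpretation of prefix and suffix, the forward maximum matching, and all function names below are assumptions made for illustration.

```python
from typing import Dict, List, Set


def glossary_mask(text: str, glossary: Set[str], max_len: int = 6) -> List[bool]:
    """Mark characters covered by a glossary term (forward maximum matching; assumed)."""
    mask = [False] * len(text)
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in glossary:
                for k in range(i, i + n):
                    mask[k] = True
                i += n
                break
        else:
            i += 1  # no term starts here; move on one character
    return mask


def char_features(text: str, i: int, mask: List[bool], window: int = 2) -> Dict[str, object]:
    """Features for the character at position i (templates are assumptions)."""
    def at(j: int) -> str:
        return text[j] if 0 <= j < len(text) else "<PAD>"

    feats: Dict[str, object] = {"bias": 1.0, "char": text[i]}
    # Sliding window: neighbouring characters at offsets -window..+window.
    for off in range(-window, window + 1):
        if off != 0:
            feats[f"char[{off:+d}]"] = at(i + off)
    # Prefix / suffix, interpreted here as bigrams with the neighbours (assumed).
    feats["prefix_bigram"] = at(i - 1) + text[i]
    feats["suffix_bigram"] = text[i] + at(i + 1)
    # Lexical feature: is this character inside a matched glossary term?
    feats["in_glossary"] = mask[i]
    return feats


def sentence_features(text: str, glossary: Set[str]) -> List[Dict[str, object]]:
    """One feature dictionary per character, ready for a CRF library."""
    mask = glossary_mask(text, glossary)
    return [char_features(text, i, mask) for i in range(len(text))]


if __name__ == "__main__":
    # Toy example: two symptom terms in a short history fragment.
    for f in sentence_features("患者口干多饮", {"口干", "多饮"}):
        print(f)
```

Feature dictionaries of this shape are what common CRF toolkits accept as per-token input; the label sequence (for example a B/I/O scheme over symptom terms) would be supplied separately during training.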
2. Distribution of CAG repeat number within androgen receptor gene in Chinese Han nationality and its application in genetic diagnosis for Kennedy's disease
Yuhu ZHANG ; Kun NIE ; Yanbo YUAN ; Xin WAN ; Rong GAN ; Jiehao ZHAO ; Zhiheng HUANG ; Limin WANG ; Lijuan WANG
Chinese Journal of Geriatrics 2011;30(12):1024-1026
Objective To investigate the distribution of androgen receptor (AR) gene CAG repeats in the Chinese Han nationality and its application in genetic diagnosis for Kennedy's disease (KD). Methods RT-PCR, denaturing polyacrylamide gel electrophoresis (DPAGE) and gene sequencing were performed on the CAG repeat region of the AR gene in 100 healthy controls and 28 patients diagnosed with motor neuron disease, and the number of repeats was counted. Results The healthy controls had 15-31 CAG repeats, with an average of (23 ± 3). Among the patients with motor neuron disease, 3 cases with more than 40 CAG repeats (46, 47 and 47 repeats) were diagnosed as KD. The main clinical manifestations included slowly progressive limb weakness, primarily in the proximal lower limbs, fatigue accompanied by myalgia, muscle twitching, muscle atrophy, elevated serum creatine kinase (CK) levels, neurogenic damage revealed by electromyogram (EMG), and androgen insensitivity. Conclusions The incidence of KD may be underestimated in the Chinese population. Performing genetic testing of the AR gene in patients with motor neuron disease can improve clinical diagnosis and avoid misdiagnosis.
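The repeat-counting step in this abstract can be pictured with a short sketch that finds the longest uninterrupted run of CAG triplets in a sequenced fragment and compares it with the cutoff of more than 40 repeats reported above. This is only an illustration of the arithmetic, not the study's laboratory procedure: the function names are hypothetical, the input sequences are synthetic, and the assumption that the repeat tract is one uninterrupted run is a simplification.

```python
import re


def longest_cag_run(sequence: str) -> int:
    """Return the number of triplets in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def exceeds_kd_cutoff(repeats: int, cutoff: int = 40) -> bool:
    """True when the repeat count is above the cutoff (more than 40 in the abstract)."""
    return repeats > cutoff


if __name__ == "__main__":
    # Synthetic fragments: flanking bases are arbitrary placeholders.
    control = "AT" + "CAG" * 23 + "GGC"
    expanded = "AT" + "CAG" * 46 + "GGC"
    for name, seq in [("control", control), ("expanded", expanded)]:
        n = longest_cag_run(seq)
        print(name, n, exceeds_kd_cutoff(n))
```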