A Multi-Classifier Based Guideline Sentence Classification System.
10.4258/hir.2011.17.4.224
- Author: Mi Hwa SONG (1); Sung Hyun KIM; Dong Kyun PARK; Young Ho LEE
- Author Information: (1) U-Healthcare Institute, Gachon University of Medicine and Science, Incheon, Korea. leeyh@gachon.ac.kr
- Publication Type: Original Article
- Keywords: Knowledge Bases; Data Mining; Natural Language Processing
- MeSH: Data Mining; Humans; Hypertension; Imidazoles; Joints; Knowledge Bases; Natural Language Processing; Nitro Compounds; Search Engine
- From: Healthcare Informatics Research 2011;17(4):224-231
- Country: Republic of Korea
- Language: English
- Abstract:
OBJECTIVES: We designed an efficient clinical process guideline (CPG) modeling service that uses an enhanced intelligent search protocol. A search system is needed because CPG models must adapt to dynamic patient contexts and be updated as new evidence emerges from medical guidelines and papers.
METHODS: A sentence category classifier combined with the AdaBoost.M1 algorithm was used to evaluate the contribution of the CPG to the quality of the search mechanism. Three annotators each tagged 340 sentences hand-chosen from the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure (JNC7) clinical guideline, and then cross-validated the tagged corpus. A transformation function was also used to extract a predefined set of structural feature vectors, obtained by analyzing each sentence in terms of the underlying syntactic structures and phrase-level co-occurrences that lie beneath its surface lexical form.
RESULTS: Additional sub-filtering with a combination of multiple classifiers was more effective than a conventional Term Frequency-Inverse Document Frequency (TF-IDF)-based search system alone at pinpointing the page that contains, or is adjacent to, the guideline information.
CONCLUSIONS: The transformation has the advantage of exploiting structural and underlying features that the bag-of-words (BOW) model does not capture. Integrating a sentential classifier with a TF-IDF-based search engine also enhances the search process by maximizing the probability that the automatically presented information is relevant to the context generated by the guideline authoring environment.
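The combination described in the abstract, a TF-IDF ranking followed by a sentence-level sub-filter, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the smoothed IDF variant, the example sentences, and above all the keyword rule `looks_like_recommendation` are assumptions made for illustration; the keyword rule is only a placeholder for the trained multi-classifier, which the abstract does not specify in detail.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace after removing basic punctuation."""
    return text.lower().replace(",", " ").replace(".", " ").split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumed variant; the abstract does not give a formula).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()}
               for toks in tokenized]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs, sentence_filter, top_k=3):
    """Rank docs by TF-IDF cosine similarity, then keep only those that
    the sentence-level classifier accepts (the sub-filtering step)."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the corpus get weight 0.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    ranked = sorted(((cosine(q, v), i) for i, v in enumerate(vectors)),
                    reverse=True)
    hits = [i for score, i in ranked if score > 0 and sentence_filter(docs[i])]
    return [docs[i] for i in hits[:top_k]]


def looks_like_recommendation(sentence):
    """Hypothetical stand-in for the trained sentence category classifier."""
    cues = {"should", "recommended", "must"}
    return any(tok in cues for tok in tokenize(sentence))


# Made-up example sentences, not taken from the annotated corpus.
DOCS = [
    "Thiazide-type diuretics should be used as initial therapy for most patients with hypertension.",
    "Hypertension affects a large number of adults.",
    "The committee met several times to review the literature.",
]

if __name__ == "__main__":
    print(search("hypertension therapy", DOCS, looks_like_recommendation))
    print(search("hypertension therapy", DOCS, lambda s: True))
```

With the filter enabled, only the first sentence is returned for the query "hypertension therapy". With a pass-through filter, the second sentence also appears, ranked lower. In a setup closer to the one described, the placeholder rule would be replaced by a boosted classifier trained on the tagged sentences.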