Effective Query Expansion using Condensed UMLS Metathesaurus for Medical Information Retrieval.
- Author:
Seung Bin HAN 1 ; Jinwook CHOI
Author Information
1. Department of Biomedical Engineering, College of Medicine, Seoul National University, Korea. jinchoi@snu.ac.kr
- Publication Type:Evaluation Studies ; Original Article
- Keywords:
Unified Medical Language System;
Metathesaurus;
Information Storage and Retrieval;
Evaluation Studies;
Computing Methodologies
- MeSH:
Brain;
Computing Methodologies;
Humans;
Information Storage and Retrieval*;
Medical Records;
Seoul;
Unified Medical Language System*;
Vocabulary;
Vocabulary, Controlled
- From:Journal of Korean Society of Medical Informatics
2004;10(1):43-53
- Country:Republic of Korea
- Language:Korean
- Abstract:
Terms in medical records appear as many synonyms and variant expressions even when they denote the same concept. Query expansion using a thesaurus enhances the recall of a medical information retrieval (IR) system for searching patient records or the literature. This study proposes an IR system architecture that applies the Metathesaurus (MT) of the Unified Medical Language System (UMLS). To improve retrieval effectiveness while reducing retrieval time, we constructed a condensed Metathesaurus (CMT) consisting of terms frequently used in medical records. We used 40,000 radiology reports of brain CT/MRI from Seoul National University Hospital. The retrieval model was Boolean. Implementing the UMLS MT in the IR system for query expansion yielded a 15 to 27% gain in effectiveness at retrieving relevant documents, but retrieval took 3.5 times longer than in the IR system without the MT. When we applied the CMT instead, retrieval time fell by 50% and retrieval performance decreased by only 8.7% compared with the IR system using the full MT. In summary, we developed a medical document retrieval system that applies the UMLS MT for query expansion to improve the retrieval of relevant documents, and that reduces retrieval time by building a condensed Metathesaurus for a specific domain.
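The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the concept IDs, synonym groups, sample reports, the `min_df` threshold, and all function names are hypothetical, and the toy dictionary only stands in for the UMLS Metathesaurus. It shows Boolean (OR) retrieval over an inverted index, query expansion through synonym groups, and a condensed thesaurus that keeps only the terms that actually occur in the corpus.

```python
from collections import defaultdict

# Toy stand-in for a thesaurus: concept ID -> synonymous terms.
# IDs and groupings are illustrative only, not real UMLS entries.
THESAURUS = {
    "C1": {"infarction", "infarct", "stroke"},
    "C2": {"hemorrhage", "bleeding", "haemorrhage"},
}

# Hypothetical report snippets standing in for radiology reports.
DOCS = {
    1: "acute infarct in left mca territory",
    2: "no evidence of hemorrhage or infarction",
    3: "small bleeding in right thalamus",
    4: "old stroke with encephalomalacia",
}


def build_index(docs):
    """Build an inverted index: token -> set of document IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[token].add(doc_id)
    return index


def condense(thesaurus, index, min_df=1):
    """Keep only synonyms found in at least min_df documents (assumed criterion)."""
    condensed = {}
    for concept_id, terms in thesaurus.items():
        kept = {t for t in terms if len(index.get(t, ())) >= min_df}
        if kept:
            condensed[concept_id] = kept
    return condensed


def expand(term, thesaurus):
    """Return the query term plus every synonym sharing a concept with it."""
    expanded = {term}
    for terms in thesaurus.values():
        if term in terms:
            expanded |= terms
    return expanded


def boolean_or_search(term, index, thesaurus=None):
    """Boolean OR over the (optionally expanded) query terms."""
    terms = expand(term, thesaurus) if thesaurus else {term}
    hits = set()
    for t in terms:
        hits |= index.get(t, set())
    return hits


index = build_index(DOCS)
condensed = condense(THESAURUS, index)
plain_hits = boolean_or_search("infarction", index)
expanded_hits = boolean_or_search("infarction", index, THESAURUS)
condensed_hits = boolean_or_search("hemorrhage", index, condensed)
```

In this sketch the condensed thesaurus drops "haemorrhage" because no sample report contains it, so fewer index lookups are needed per query. That is the general mechanism by which a smaller, domain-specific thesaurus could shorten retrieval time, at the cost of missing terms that appear only rarely.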