Development of a System for Extracting the Information of Candidate Tumor Markers Reported in Biomedical Literatures.
10.3343/kjlm.2008.28.1.79
- Author:
Jeong Min CHAE 1; Heung Bum OH; Sung Eun CHOI; Choong Hwan CHA; Myung Hee KIM; Soon Young JUNG
Author Information
1. Department of Computer Science Education, Korea University, Seoul, Korea.
- Publication Type:Original Article ; English Abstract
- Keywords:
Tumor;
Tumor marker;
Information extraction
- MeSH:
Abstracting and Indexing as Topic;
Algorithms;
Database Management Systems;
Humans;
*Medical Informatics Computing;
Neoplasms/metabolism;
Programming Languages;
*PubMed;
Software;
*Tumor Markers, Biological
- From:The Korean Journal of Laboratory Medicine
2008;28(1):79-87
- Country:Republic of Korea
- Language:Korean
- Abstract:
BACKGROUND: Since the Human Genome Project was completed in 2003, numerous reports on cancer and related markers have been published. This study aimed to develop a system that automatically extracts information on the relationship between cancers and tumor markers from the biomedical literature.
METHODS: Named entities of tumor markers were recognized using both a dictionary-based method and a machine learning method, the support vector machine. Named entities of cancers were recognized using the MeSH dictionary.
RESULTS: Relational and filtering keywords were selected after annotating 160 abstracts from PubMed. Relational information was extracted only when one of the relational keywords was in an appropriate position along the parse tree of a sentence containing both tumor marker and disease entities. The performance of the system was evaluated with a separate set of 77 abstracts. With the relational and filtering keywords, precision was 94.38% and recall was 66.14%; without this expert knowledge, precision was 49.16% and recall was 69.29%.
CONCLUSIONS: We developed a system that extracts relational information between a tumor and its markers by incorporating expert knowledge. A system that exploits expert knowledge in this way could serve as a reference for developing information extraction systems in other medical fields.
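The keyword-gated extraction and the precision/recall evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionaries and keyword lists below are hypothetical toy values, filtering keywords are assumed to act as negation-style cues that suppress extraction, and the parse-tree position check is simplified to sentence-level co-occurrence.

```python
import re

# Hypothetical toy lexicons; the actual dictionaries and keyword lists
# used in the study are not given in the abstract.
MARKERS = {"CEA", "PSA", "CA-125"}
CANCERS = {"colorectal cancer", "prostate cancer", "ovarian cancer"}
RELATION_KEYWORDS = {"marker", "elevated", "overexpressed", "associated"}
FILTER_KEYWORDS = {"not", "no"}  # assumption: cues that block extraction


def extract_relations(text):
    """Return (marker, cancer) pairs from sentences that contain a
    relational keyword and no filtering keyword.

    Simplification: the study checks keyword position along the parse
    tree; this sketch only checks co-occurrence within a sentence.
    """
    relations = set()
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        tokens = set(re.findall(r"[a-z0-9-]+", lowered))
        if not (tokens & RELATION_KEYWORDS) or (tokens & FILTER_KEYWORDS):
            continue
        markers = {
            m for m in MARKERS
            if re.search(r"\b" + re.escape(m.lower()) + r"\b", lowered)
        }
        cancers = {c for c in CANCERS if c in lowered}
        relations.update((m, c) for m in markers for c in cancers)
    return relations


def precision_recall(predicted, gold):
    """Precision = TP / |predicted|, recall = TP / |gold|."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

In this simplified form, adding filtering keywords can only remove candidate pairs, which is consistent with the reported pattern of much higher precision at a small cost in recall.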