- Author: Md Mamunur RASHID; Md Rezaul KARIM; Byeong Soo JEONG; Ho Jin CHOI
- Publication Type: Original Article
- Keywords: DNA sequence; index-based method; information gain; pattern mining
- MeSH: Base Sequence; Computational Biology; DNA; Mining
- From: Genomics & Informatics 2012;10(1):44-50
- Country: Republic of Korea
- Language: English
- Abstract: Pattern discovery in biological sequences (e.g., DNA sequences) is one of the most challenging tasks in computational biology and bioinformatics. In most existing approaches, the number of occurrences is the main measure for determining whether a pattern is interesting. In computational biology, however, an infrequent pattern may still be highly informative if its actual support exceeds the prior expectation by a large margin. In this paper, we propose a new interestingness measure that can provide meaningful biological information. We also propose an efficient index-based method for mining such interesting patterns. Experimental results show that our approach finds interesting patterns within an acceptable computation time.
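
The abstract does not give the definition of the proposed measure, so the following is only a minimal sketch of the general idea that a pattern can be interesting when its observed support greatly exceeds its expected support. It assumes a simple background model in which bases occur independently with their empirical frequencies, and it scores a pattern by the log ratio of observed to expected counts. The function names (`count_occurrences`, `expected_occurrences`, `surprise`) and the scoring formula are hypothetical illustrations, not the paper's method.

```python
import math
from collections import Counter


def count_occurrences(sequence: str, pattern: str) -> int:
    """Count occurrences of pattern in sequence, allowing overlaps."""
    k = len(pattern)
    return sum(
        1 for i in range(len(sequence) - k + 1) if sequence[i:i + k] == pattern
    )


def expected_occurrences(sequence: str, pattern: str) -> float:
    """Expected count under an independent-bases background model.

    Assumption for illustration only: each base is drawn independently
    with its empirical frequency in the sequence.
    """
    n = len(sequence)
    base_freq = Counter(sequence)
    prob = 1.0
    for base in pattern:
        prob *= base_freq[base] / n
    # Number of start positions where the pattern could occur.
    return prob * (n - len(pattern) + 1)


def surprise(sequence: str, pattern: str) -> float:
    """Return log2(observed / expected); positive means over-represented."""
    observed = count_occurrences(sequence, pattern)
    expected = expected_occurrences(sequence, pattern)
    if observed == 0 or expected == 0:
        return float("-inf")
    return math.log2(observed / expected)


if __name__ == "__main__":
    seq = "AAAACGCG"
    for pat in ("AA", "CG"):
        print(pat, count_occurrences(seq, pat), round(surprise(seq, pat), 3))
```

In this toy sequence, `CG` occurs less often than `AA`, yet it scores higher because its count is further above what the base composition alone would predict. A purely frequency-based ranking would order the two patterns the other way around.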