RNA secondary structure prediction based on support vector machine classification.
- Author:
Yingjie ZHAO (1); Zhengzhi WANG
- Author Information:
1. College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha 410073, China. matriz@163.com
- Publication Type:Journal Article
- MeSH:
Algorithms;
Artificial Intelligence;
Base Pairing;
Computational Biology/methods;
RNA/chemistry/classification;
Sequence Alignment/methods/statistics & numerical data;
Sequence Analysis, RNA;
Thermodynamics
- From:
Chinese Journal of Biotechnology
2008;24(7):1140-1148
- Country:China
- Language:Chinese
- Abstract:
Comparative sequence analysis is the most reliable method for RNA secondary structure prediction, and many algorithms based on it have been developed over the last several decades. This paper treats RNA structure prediction as a two-class classification problem: given a sequence alignment, decide whether or not two columns of the alignment form a base pair. We employed a Support Vector Machine (SVM) to predict potential paired sites, using co-variation information, thermodynamic information, and the fraction of complementary bases as the components of the feature vector. Because sequence similarity affects the co-variation score, we introduced a similarity weight factor that adjusts the contributions of co-variation and thermodynamic information to the prediction according to the similarity of the aligned sequences. Tests on 49 Rfam seed alignments showed that the method is effective and that its accuracy is higher than that of many similar algorithms. Furthermore, the method can predict simple pseudoknots.
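The column-pair features named in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: co-variation is measured here as mutual information between two alignment columns, "complementary" means Watson-Crick pairs plus the G-U wobble pair, and sequence similarity is summarized as mean pairwise identity. The thermodynamic feature, the similarity weight factor, and the SVM itself are not reproduced, since their definitions are not given here. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Assumed definition of "complementary": Watson-Crick pairs plus G-U wobble.
COMPLEMENTARY = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def column(alignment, i):
    """Return column i of the alignment as a list of characters."""
    return [seq[i] for seq in alignment]


def mutual_information(col_i, col_j):
    """Mutual information in bits between two columns, skipping gapped rows."""
    pairs = [(a, b) for a, b in zip(col_i, col_j) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) / (p(a) * p(b)) simplifies to count * n / (left[a] * right[b])
        mi += (count / n) * math.log2(count * n / (left[a] * right[b]))
    return mi


def complementary_fraction(col_i, col_j):
    """Fraction of sequences whose bases in the two columns can pair."""
    hits = sum((a, b) in COMPLEMENTARY for a, b in zip(col_i, col_j))
    return hits / len(col_i)


def mean_pairwise_identity(alignment):
    """Average fraction of identical positions over all sequence pairs."""
    scores = [
        sum(a == b for a, b in zip(s, t)) / len(s)
        for s, t in combinations(alignment, 2)
    ]
    return sum(scores) / len(scores)


def pair_features(alignment, i, j):
    """Two-component feature vector for columns i and j (no thermodynamic term)."""
    col_i, col_j = column(alignment, i), column(alignment, j)
    return (mutual_information(col_i, col_j),
            complementary_fraction(col_i, col_j))


# Toy alignment (made up for illustration): columns 0 and 4 co-vary and pair.
toy_alignment = ["GACUC", "CACUG", "AACUU", "UACUA"]
paired = pair_features(toy_alignment, 0, 4)      # high co-variation, all pairs
unpaired = pair_features(toy_alignment, 1, 2)    # conserved columns, no pairs
similarity = mean_pairwise_identity(toy_alignment)
```

In a full pipeline, vectors such as `paired` and `unpaired` would be labeled from known structures and passed to an SVM classifier. Conserved columns give zero mutual information even when they do pair, which is one plausible reason for weighting co-variation against thermodynamic information by sequence similarity.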