Sequence Signatures of Nucleosome Positioning in Caenorhabditis elegans
10.1016/S1672-0229(10)60010-1
- Author:
Chen KAIFU 1;
Wang LEI;
Yang MENG;
Liu JIUCHENG;
Xin CHENGQI;
Hu SONGNIAN;
Yu JUN
Author Information
1. CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China; Graduate University of Chinese Academy of Sciences, Beijing 100049, China
- Keywords:
nucleosome positioning;
sequence signature;
periodicity;
HMM
- From:
Genomics, Proteomics & Bioinformatics
2010;08(2):92-102
- Country: China
- Language: Chinese
- Abstract:
Our recent investigation in the protist Trichomonas vaginalis suggested a DNA sequence periodicity with a unit length of 120.9 nt, which represents a sequence signature for nucleosome positioning. We have now extended our observation to higher eukaryotes and identified a similar periodicity, 175 nt in length, in Caenorhabditis elegans. In the process of defining the sequence compositional characteristics, we found that the 10.5-nt periodicity, the sequence signature of the DNA double helix, may not be sufficient for cross-nucleosome positioning but provides essential guiding rails to facilitate positioning. We further dissected nucleosome-protected sequences and identified a strong positive purine (AG) gradient from the 5'-end to the 3'-end; we also found that the nucleosome-enriched regions are GC-rich compared with the nucleosome-free sequences, as purine content is positively correlated with GC content. Sequence characterization allowed us to develop a hidden Markov model (HMM) algorithm for decoding nucleosome positioning computationally. Based on a set of training data from the fifth chromosome of C. elegans, our algorithm predicted 60%-70% of the well-positioned nucleosomes, which is 15%-20% higher than random positioning. We concluded that nucleosomes are not randomly positioned on DNA sequences, yet they bind to different genome regions with variable stability; that well-positioned nucleosomes leave sequence signatures on DNA; and that the statistical positioning of nucleosomes across the genome can be decoded computationally based on these sequence signatures.
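
The abstract does not state how the sequence periodicity was measured. As an illustration only, one common way to look for a repeating compositional signal is to autocorrelate a purine indicator track over a range of lags and report the lag with the highest score. The sketch below assumes that approach; the function names (`purine_indicator`, `autocorrelation`, `strongest_period`) and the toy sequence are hypothetical and are not taken from the paper.

```python
def purine_indicator(seq):
    """Map a DNA string to 1 for purines (A or G) and 0 for anything else."""
    return [1 if base in "AG" else 0 for base in seq.upper()]


def autocorrelation(signal, lag):
    """Mean-centred autocorrelation of a numeric signal at a single lag."""
    n = len(signal) - lag
    if lag < 1 or n <= 0:
        raise ValueError("lag must be positive and shorter than the signal")
    mean = sum(signal) / len(signal)
    var = sum((x - mean) ** 2 for x in signal) / len(signal)
    if var == 0:
        return 0.0  # a constant signal carries no periodic information
    cov = sum((signal[i] - mean) * (signal[i + lag] - mean) for i in range(n)) / n
    return cov / var


def strongest_period(seq, min_lag, max_lag):
    """Return (best_lag, scores) for lags in [min_lag, max_lag]."""
    signal = purine_indicator(seq)
    scores = {lag: autocorrelation(signal, lag)
              for lag in range(min_lag, max_lag + 1)}
    best_lag = max(scores, key=scores.get)
    return best_lag, scores


# Toy input (an assumption for illustration): a 7-nt unit repeated 50 times,
# so the purine track repeats exactly every 7 positions.
toy = "AAGGCTC" * 50
best_lag, scores = strongest_period(toy, 2, 10)
```

On real genomic sequence the signal would be far noisier than in this toy case, so a scan like this would normally be run over many aligned nucleosome-protected fragments and over a lag range wide enough to include both the 10.5-nt and the 175-nt scales mentioned in the abstract.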