1. Comparative genomic study reveals a transition from TA richness in invertebrates to GC richness in vertebrates at CpG flanking sites: an indication for context-dependent mutagenicity of methylated CpG sites.
Yong WANG; Frederick C C LEUNG
Genomics, Proteomics & Bioinformatics 2008;6(3-4):144-154
Vertebrate genomes are characterized by CpG deficiency, particularly in GC-poor regions. This GC content-related CpG deficiency is probably caused by context-dependent deamination of methylated CpG sites. This study examined that hypothesis by comparing nucleotide frequencies at CpG flanking positions among invertebrate and vertebrate genomes. The main finding is a transition in nucleotide preference from 5' T to 5' A at the invertebrate-vertebrate boundary, indicating that a large number of CpG sites with a 5' T were depleted by the global DNA methylation that developed in vertebrates. At the genome level, we investigated CpG observed/expected (obs/exp) values in 500 bp fragments and found that higher CpG obs/exp values occur in GC-poor regions of invertebrate genomes (except sea urchin) but in GC-rich sequences of vertebrate genomes. We next compared GC content at CpG flanking positions with the genomic average, showing that this GC content is lower than the average in invertebrate genomes but higher than the average in vertebrate genomes. These results indicate that although 5' T and 5' A differ in their effect on deamination of methylated CpG sites, GC content is even more important in determining the deamination rate. In all tests, the results for sea urchin are similar to those for vertebrates, perhaps because of its fractional DNA methylation. CpG deficiency is therefore suggested to be mainly a result of high mutation rates of methylated CpG sites in GC-poor regions.
AT Rich Sequence; Animals; CpG Islands/genetics; DNA Methylation; GC Rich Sequence; Gene Frequency; Genome; Genomics/methods; Humans; Invertebrates/genetics; Isochores/genetics; Mutation; Vertebrates/genetics
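
The abstract refers to CpG obs/exp values in 500 bp fragments and to nucleotide frequencies at CpG flanking positions. The following is a minimal sketch of how such quantities might be computed; it is not the authors' code. It assumes the widely used ratio (CpG count x fragment length) / (C count x G count), which the abstract does not spell out, and non-overlapping windows. The function names cpg_obs_exp, gc_content, window_stats and five_prime_flank_counts are hypothetical.

```python
from collections import Counter


def cpg_obs_exp(seq):
    """CpG observed/expected ratio for one fragment.

    Assumed formula (not stated in the abstract):
    (number of CG dinucleotides * length) / (number of C * number of G).
    Returns None when the fragment has no C or no G.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return None
    return seq.count("CG") * len(seq) / (c * g)


def gc_content(seq):
    """Fraction of G and C bases in a fragment, or None if it is empty."""
    seq = seq.upper()
    if not seq:
        return None
    return (seq.count("G") + seq.count("C")) / len(seq)


def window_stats(genome, size=500):
    """(start, GC content, CpG obs/exp) for non-overlapping windows.

    A trailing partial window shorter than `size` is skipped.
    """
    stats = []
    for start in range(0, len(genome) - size + 1, size):
        frag = genome[start:start + size]
        stats.append((start, gc_content(frag), cpg_obs_exp(frag)))
    return stats


def five_prime_flank_counts(seq):
    """Count the base immediately 5' of each CG dinucleotide."""
    seq = seq.upper()
    counts = Counter()
    # Start at index 1 so every CG found has a base before it.
    pos = seq.find("CG", 1)
    while pos != -1:
        counts[seq[pos - 1]] += 1
        pos = seq.find("CG", pos + 1)
    return counts
```

Plotting the obs/exp value against GC content for each window would give the kind of GC-poor versus GC-rich comparison the abstract describes, and tallying five_prime_flank_counts per genome would give the 5' T versus 5' A comparison.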