1.CVTree3 Web Server for Whole-genome-based and Alignment-free Prokaryotic Phylogeny and Taxonomy
Genomics, Proteomics & Bioinformatics 2015;(5):321-331
A faithful phylogeny and an objective taxonomy for prokaryotes should agree with each other and ultimately follow the genome data. With the number of sequenced genomes reaching tens of thousands, both tree inference and detailed comparison with taxonomy are great challenges. We now provide one solution in the latest Release 3.0 of the alignment-free and whole-genome-based web server CVTree3. The server resides in a cluster of 64 cores and is equipped with an interactive, collapsible, and expandable tree display. It is capable of comparing the tree branching order with prokaryotic classification at all taxonomic ranks from domains down to species and strains. CVTree3 allows for inquiry by taxon names and trial on lineage modifications. In addition, it reports a summary of monophyletic and non-monophyletic taxa at all ranks and produces print-quality subtree figures. After giving an overview of retrospective verification of the CVTree approach, the power of the new server is described for the mega-classification of prokaryotes and determination of taxonomic placement of some newly sequenced genomes. A few discrepancies between CVTree and 16S rRNA analyses are also summarized with regard to possible taxonomic revisions. CVTree3 is freely accessible to all users at http://tlife.fudan.edu.cn/cvtree3/ without login requirements.
2.LVTree Viewer: An Interactive Display for the All-Species Living Tree Incorporating Automatic Comparison with Prokaryotic Systematics
Zuo GUANGHONG ; Zhi XIAOYANG ; Xu ZHAO ; Hao BAILIN
Genomics, Proteomics & Bioinformatics 2016;14(2):94-102
We describe an interactive viewer for the All-Species Living Tree (LVTree). The viewer incorporates treeing and lineage information from the ARB-SILVA website. It allows collapsing the tree branches at different taxonomic ranks and expanding the collapsed branches again, keeping the overall topology of the tree unchanged. It also enables the user to observe the consequences of trial lineage modifications by re-collapsing the tree. The system reports taxon statistics at all ranks automatically after each collapsing and re-collapsing. These features greatly facilitate the comparison of the 16S rRNA sequence phylogeny with prokaryotic taxonomy in a taxon-by-taxon manner. Because present prokaryotic systematics is largely based on 16S rRNA sequence analysis, the current viewer may help reveal discrepancies between phylogeny and taxonomy. As an application, we show that in the latest release of LVTree, based on 11,939 rRNA sequences, as few as 24 lineage modifications are enough to bring all but two phyla (Proteobacteria and Firmicutes) to monophyletic clusters.
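The taxon-by-taxon comparison described above rests on one test: whether all members of a taxon form a single clade of the tree. The following is a minimal sketch of that test, not the viewer's actual implementation. The nested-tuple tree encoding, the toy strain names, and the helper names `leaf_set`, `is_monophyletic`, and `summarize` are all assumptions made for illustration.

```python
def leaf_set(node):
    """Return the set of leaf names under a node (leaf = str, clade = tuple)."""
    if isinstance(node, str):
        return {node}
    names = set()
    for child in node:
        names |= leaf_set(child)
    return names


def is_monophyletic(tree, members):
    """True if some clade of the rooted tree has exactly `members` as its leaves."""
    members = set(members)
    if leaf_set(tree) == members:
        return True
    if isinstance(tree, str):
        return False
    # Descend only into a child that still contains every member.
    return any(
        members <= leaf_set(child) and is_monophyletic(child, members)
        for child in tree
    )


def summarize(tree, lineage):
    """Map each taxon name to True (monophyletic) or False (not)."""
    return {taxon: is_monophyletic(tree, m) for taxon, m in lineage.items()}


# Hypothetical toy tree: strain B1 branches inside genus A.
toy_tree = ((("A1", "A2"), "B1"), ("B2", "C1"))
before = {"A": {"A1", "A2"}, "B": {"B1", "B2"}, "C": {"C1"}}
# Trial lineage modification: reassign B1 to genus A, then re-check.
after = {"A": {"A1", "A2", "B1"}, "B": {"B2"}, "C": {"C1"}}

report_before = summarize(toy_tree, before)
report_after = summarize(toy_tree, after)
```

In this toy case genus B is non-monophyletic under the first lineage and every genus is monophyletic after the single reassignment, which mirrors on a small scale how a trial lineage modification can resolve a discrepancy without changing the tree topology.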
3.CVTree: A Parallel Alignment-free Phylogeny and Taxonomy Tool Based on Composition Vectors of Genomes
Genomics, Proteomics & Bioinformatics 2021;19(4):662-667
Composition Vector Tree (CVTree) is an alignment-free algorithm to infer phylogenetic relationships from genome sequences. It has been successfully applied to study the phylogeny and taxonomy of viruses, prokaryotes, and fungi based on whole genomes, as well as chloroplast genomes, mitochondrial genomes, and metagenomes. Here we present standalone software for the CVTree algorithm. In the software, an extensible parallel workflow for the CVTree algorithm was designed. Based on the workflow, new alignment-free methods were also implemented. By examining the phylogeny and taxonomy of 13,903 prokaryotes based on 16S rRNA sequences, we showed that the CVTree software is an efficient and effective tool for studying phylogeny and taxonomy based on genome sequences. The code of the CVTree software is available at https://github.com/ghzuo/cvtree.
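To make the idea of an alignment-free comparison concrete, the sketch below builds a plain K-string frequency vector for each sequence and derives a distance from the cosine of the angle between two vectors. It is a simplified illustration and not the code from the repository above: the function names and the default `k=3` are assumptions, and the published CVTree method additionally subtracts a background predicted from shorter strings, a step omitted here.

```python
from collections import Counter
from math import sqrt


def composition_vector(seq, k=3):
    """Relative frequency of every overlapping K-string in `seq`."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_distance(u, v):
    """Distance in [0, 1]: 0 for parallel vectors, 0.5 for orthogonal ones."""
    norm_u = sqrt(sum(a * a for a in u.values()))
    norm_v = sqrt(sum(b * b for b in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("sequence shorter than k: empty composition vector")
    dot = sum(a * v.get(kmer, 0.0) for kmer, a in u.items())
    return (1.0 - dot / (norm_u * norm_v)) / 2.0


# Hypothetical toy sequences; real inputs would be whole-genome data.
cv_a = composition_vector("ACGTACGTACGT")
cv_b = composition_vector("ACGTACGAACGT")
cv_c = composition_vector("TTTTTTTTTTTT")

d_ab = cosine_distance(cv_a, cv_b)
d_ac = cosine_distance(cv_a, cv_c)
```

A matrix of such pairwise distances can then be passed to any distance-based tree-building method; no sequence alignment is needed at any step.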
4.Jackknife and Bootstrap Tests of the Composition Vector Trees
Zuo GUANGHONG ; Xu ZHAO ; Yu HONGJIE ; Hao BAILIN
Genomics, Proteomics & Bioinformatics 2010;8(4):262-267
Composition vector trees (CVTrees) are inferred from whole-genome data by an alignment-free and parameter-free method. The agreement of these trees with the corresponding taxonomy provides an objective justification of the inferred phylogeny. In this work, we show the stability and self-consistency of CVTrees by performing bootstrap and jackknife re-sampling tests adapted to this alignment-free approach. Our ultimate goal is to advocate the viewpoint that time-consuming statistical re-sampling tests can be avoided altogether when using this alignment-free approach. Agreement with taxonomy should be taken as a major criterion to estimate prokaryotic phylogenetic trees.
5.Polyphyly in 16S rRNA-based LVTree Versus Monophyly in Whole-genome-based CVTree
Guanghong ZUO ; Ji QI ; Bailin HAO
Genomics, Proteomics & Bioinformatics 2018;16(5):310-319
We report an important but long-overlooked manifestation of the low resolving power of 16S rRNA sequence analysis at the species level: in 16S rRNA-based phylogenetic trees, polyphyletic placements of closely related species are abundant compared with those in genome-based phylogeny. This phenomenon makes the demarcation of genera within many families ambiguous in the 16S rRNA-based taxonomy. In this study, we reconstructed the phylogenetic relationships of more than ten thousand prokaryote genomes using the CVTree method, which is based on whole-genome information. Many genera that are polyphyletic in 16S rRNA-based trees are well resolved as monophyletic clusters by CVTree. We believe that, with genome sequencing of prokaryotes becoming commonplace, genome-based phylogeny is destined to play a definitive role in the construction of a natural and objective taxonomy.
Genome; Genomics; Phylogeny; RNA, Ribosomal, 16S; genetics; Sequence Analysis, DNA