1. Compositional Variability and Mutation Spectra of Monophyletic SARS-CoV-2 Clades
Teng XUFEI ; Li QIANPENG ; Li ZHAO ; Zhang YUANSHENG ; Niu GUANGYI ; Xiao JINGFA ; Yu JUN ; Zhang ZHANG ; Song SHUHUI
Genomics, Proteomics & Bioinformatics 2020;18(6):648-663
COVID-19 and its causative pathogen SARS-CoV-2 have rushed the world into a staggering pandemic in a few months, and a global fight against both has been intensifying. Here, we describe an analysis procedure where genome composition and its variables are related, through the genetic code, to molecular mechanisms, based on understanding of RNA replication and its feedback loop from mutation to viral proteome sequence fraternity including effective sites on the replicase-transcriptase complex. Our analysis starts with primary sequence information, identity-based phylogeny based on 22,051 SARS-CoV-2 sequences, and evaluation of sequence variation patterns as mutation spectra and its 12 permutations among organized clades. All are tailored to two key mechanisms: strand-biased and function-associated mutations. Our findings are listed as follows: 1) The most dominant mutation is C-to-U permutation, whose abundant second-codon-position counts alter amino acid composition toward higher molecular weight and lower hydrophobicity, albeit assumed most slightly deleterious. 2) The second abundance group includes three negative-strand mutations (U-to-C, A-to-G, and G-to-A) and a positive-strand mutation (G-to-U) due to DNA repair mechanisms after cellular abasic events. 3) A clade-associated biased mutation trend is found attributable to elevated level of negative-sense strand synthesis. 4) Within-clade permutation variation is very informative for associating non-synonymous mutations and viral proteome changes. These findings demand a platform where emerging mutations are mapped onto mostly subtle but fast-adjusting viral proteomes and transcriptomes, to provide biological and clinical information after logical convergence for effective pharmaceutical and diagnostic applications. Such actions are in desperate need, especially in the middle of the War against COVID-19.
2. Research progress of lipidomics in primary hepatocellular carcinoma
Xiaoju SHI ; Qianqian ZHENG ; Junqi NIU ; Guoyue LYU ; Xingkai LIU ; Guangyi WANG
Chinese Journal of Hepatology 2019;27(10):809-812
Nonalcoholic fatty liver disease has become the most common cause of chronic liver disease worldwide and can lead to hepatocellular carcinoma (HCC). Lipid metabolism in cancer cells is closely related to tumorigenesis, invasion, and metastasis, and is thus one of the hallmarks of cancer cells. Lipidomics, an important branch of metabolomics, has recently been adopted in the study of HCC to analyze the structure and function of lipid components by chromatography and mass spectrometry. Fatty acids, glycerides, glycerophospholipids, sphingolipids, and sterols differ significantly in HCC tissues or serum. Lipidomics therefore contributes to the diagnosis, determination of prognosis, mechanistic study, and targeted therapy of HCC.
3. IC4R-2.0: Rice Genome Reannotation Using Massive RNA-seq Data
Sang JIAN ; Zou DONG ; Wang ZHENNAN ; Wang FAN ; Zhang YUANSHENG ; Xia LIN ; Li ZHAOHUA ; Ma LINA ; Li MENGWEI ; Xu BINGXIANG ; Liu XIAONAN ; Wu SHUANGYANG ; Liu LIN ; Niu GUANGYI ; Li MAN ; Luo YINGFENG ; Hu SONGNIAN ; Hao LILI ; Zhang ZHANG
Genomics, Proteomics & Bioinformatics 2020;18(2):161-172
Genome reannotation aims for complete and accurate characterization of gene models and thus is of critical significance for in-depth exploration of gene function. Although the availability of massive RNA-seq data provides great opportunities for gene model refinement, few efforts have been made to adopt these precious data in rice genome reannotation. Here we reannotate the rice (Oryza sativa L. ssp. japonica) genome based on integration of large-scale RNA-seq data and release a new annotation system IC4R-2.0. In general, IC4R-2.0 significantly improves the completeness of gene structure, identifies a number of novel genes, and integrates a variety of functional annotations. Furthermore, long non-coding RNAs (lncRNAs) and circular RNAs (circRNAs) are systematically characterized in the rice genome. Performance evaluation shows that compared to previous annotation systems, IC4R-2.0 achieves higher integrity and quality, primarily attributable to massive RNA-seq data applied in genome annotation. Consequently, we incorporate the improved annotations into the Information Commons for Rice (IC4R), a database integrating multiple omics data of rice, and accordingly update IC4R by providing more user-friendly web interfaces and implementing a series of practical online tools. Together, the updated IC4R, which is equipped with the improved annotations, bears great promise for comparative and functional genomic studies in rice and other monocotyledonous species. The IC4R-2.0 annotation system and related resources are freely accessible at http://ic4r.org/.