1.Role of Long Non-coding RNAs in Reprogramming to Induced Pluripotency.
Shahzina KANWAL ; Xiangpeng GUO ; Carl WARD ; Giacomo VOLPE ; Baoming QIN ; Miguel A ESTEBAN ; Xichen BAO
Genomics, Proteomics & Bioinformatics 2020;18(1):16-25
The generation of induced pluripotent stem cells through somatic cell reprogramming requires a global reorganization of cellular functions. This reorganization occurs in a multi-phased manner and involves a gradual revision of both the epigenome and transcriptome. Recent studies have shown that the large-scale transcriptional changes observed during reprogramming also apply to long non-coding RNAs (lncRNAs), a type of traditionally neglected RNA species that are increasingly viewed as critical regulators of cellular function. Deeper understanding of lncRNAs in reprogramming may not only help to improve this process but also have implications for studying cell plasticity in other contexts, such as development, aging, and cancer. In this review, we summarize the current progress made in profiling and analyzing the role of lncRNAs in various phases of somatic cell reprogramming, with emphasis on the re-establishment of the pluripotency gene network and X chromosome reactivation.
3.SinoDuplex: An Improved Duplex Sequencing Approach to Detect Low-frequency Variants in Plasma cfDNA Samples.
Yongzhe REN ; Yang ZHANG ; Dandan WANG ; Fengying LIU ; Ying FU ; Shaohua XIANG ; Li SU ; Jiancheng LI ; Heng DAI ; Bingding HUANG
Genomics, Proteomics & Bioinformatics 2020;18(1):81-90
Accurate detection of low-frequency mutations from plasma cell-free DNA in blood using targeted next-generation sequencing technology has shown promising benefits in clinical settings. Duplex sequencing technology is the most commonly used approach in liquid biopsies. Unique molecular identifiers are attached to each double-stranded DNA template, followed by production of low-error consensus sequences to detect low-frequency variants. However, high sequencing costs have hindered application of this approach in clinical practice. Here, we have developed an improved duplex sequencing approach called SinoDuplex, which utilizes a pool of adapters containing pre-defined barcode sequences to generate far fewer barcode combinations than with random sequences, and implemented a novel computational analysis algorithm to generate duplex consensus sequences more precisely. SinoDuplex increased the output of duplex sequencing technology, making it more cost-effective. We evaluated our approach using reference standard samples and cell-free DNA samples from lung cancer patients. Our results showed that SinoDuplex has high sensitivity and specificity in detecting very low allele frequency mutations. The source code for SinoDuplex is freely available at https://github.com/SinOncology/sinoduplex.
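The general idea of duplex consensus calling mentioned in the abstract above can be illustrated with a minimal Python sketch. This is a generic illustration and not the SinoDuplex algorithm: the function names, the input tuple layout, the thresholds `min_fraction` and `min_family_size`, and the assumption that reads are already aligned, equal in length, and reported in the same orientation are all hypothetical simplifications. Real pipelines also key read families on mapping position, which is omitted here.

```python
from collections import Counter, defaultdict


def consensus(seqs, min_fraction=0.6):
    """Column-wise majority vote over equal-length sequences.

    Emits 'N' at positions where the most common base does not reach
    min_fraction of the reads (hypothetical threshold).
    """
    out = []
    for column in zip(*seqs):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(column) >= min_fraction else "N")
    return "".join(out)


def duplex_consensus(reads, min_family_size=2):
    """Build duplex consensus sequences from barcoded reads.

    reads: iterable of (umi_a, umi_b, strand, sequence), strand is '+' or '-'.
    Returns {barcode_pair: duplex_consensus_sequence}.
    """
    families = defaultdict(lambda: {"+": [], "-": []})
    for umi_a, umi_b, strand, seq in reads:
        # The two strands of one template carry the same barcode pair in
        # swapped order, so sorting gives both strands a shared key.
        key = tuple(sorted((umi_a, umi_b)))
        families[key][strand].append(seq)

    result = {}
    for key, strands in families.items():
        # Require enough reads on BOTH strands; otherwise no duplex call.
        if min(len(strands["+"]), len(strands["-"])) < min_family_size:
            continue
        top = consensus(strands["+"])
        bottom = consensus(strands["-"])
        # Keep a base only where the two single-strand consensuses agree.
        result[key] = "".join(a if a == b else "N" for a, b in zip(top, bottom))
    return result
```

Requiring agreement between the two single-strand consensuses is what suppresses errors that arise on only one strand, such as PCR or sequencing artifacts; a true variant should be present on both strands of the original template.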
4.CRISPR Screens Identify Essential Cell Growth Mediators in BRAF Inhibitor-resistant Melanoma.
Ziyi LI ; Binbin WANG ; Shengqing GU ; Peng JIANG ; Avinash SAHU ; Chen-Hao CHEN ; Tong HAN ; Sailing SHI ; Xiaoqing WANG ; Nicole TRAUGH ; Hailing LIU ; Yin LIU ; Qiu WU ; Myles BROWN ; Tengfei XIAO ; Genevieve M BOLAND ; X SHIRLEY LIU
Genomics, Proteomics & Bioinformatics 2020;18(1):26-40
BRAF is a serine/threonine kinase that harbors activating mutations in ∼7% of human malignancies and ∼60% of melanomas. Despite initial clinical responses to BRAF inhibitors, patients frequently develop drug resistance. To identify candidate therapeutic targets for BRAF inhibitor-resistant melanoma, we conduct CRISPR screens in melanoma cells that harbor an activating BRAF mutation and have also acquired resistance to BRAF inhibitors. To investigate the mechanisms and pathways enabling resistance to BRAF inhibitors in melanomas, we integrate expression, ATAC-seq, and CRISPR screen data. We identify the JUN family transcription factors and the ETS family transcription factor ETV5 as key regulators of CDK6, which together enable resistance to BRAF inhibitors in melanoma cells. Our findings reveal genes contributing to resistance to the selective BRAF inhibitor PLX4720, providing new insights into gene regulation in BRAF inhibitor-resistant melanoma cells.
5.Procleave: Predicting Protease-specific Substrate Cleavage Sites by Combining Sequence and Structural Information.
Fuyi LI ; Andre LEIER ; Quanzhong LIU ; Yanan WANG ; Dongxu XIANG ; Tatsuya AKUTSU ; Geoffrey I WEBB ; A Ian SMITH ; Tatiana MARQUEZ-LAGO ; Jian LI ; Jiangning SONG
Genomics, Proteomics & Bioinformatics 2020;18(1):52-64
Proteases are enzymes that cleave and hydrolyse the peptide bonds between two specific amino acid residues of target substrate proteins. Protease-controlled proteolysis plays a key role in the degradation and recycling of proteins, which is essential for various physiological processes. Thus, solving the substrate identification problem will have important implications for the precise understanding of functions and physiological roles of proteases, as well as for therapeutic target identification and pharmaceutical applicability. Consequently, there is a great demand for bioinformatics methods that can predict novel substrate cleavage events with high accuracy by utilizing both sequence and structural information. In this study, we present Procleave, a novel bioinformatics approach for predicting protease-specific substrates and specific cleavage sites by taking into account both their sequence and 3D structural information. Structural features of known cleavage sites were represented by discrete values using a LOWESS data-smoothing optimization method, which turned out to be critical for the performance of Procleave. The optimal approximations of all structural parameter values were encoded in a conditional random field (CRF) computational framework, alongside sequence and chemical group-based features. Here, we demonstrate the outstanding performance of Procleave through extensive benchmarking and independent tests. Procleave is capable of correctly identifying most cleavage sites in the case study. Importantly, when applied to the human structural proteome encompassing 17,628 protein structures, Procleave suggests a number of potential novel target substrates and their corresponding cleavage sites of different proteases. Procleave is implemented as a webserver and is freely accessible at http://procleave.erc.monash.edu/.
6.The Elements of Data Sharing.
Zhang ZHANG ; Shuhui SONG ; Jun YU ; Wenming ZHAO ; Jingfa XIAO ; Yiming BAO
Genomics, Proteomics & Bioinformatics 2020;18(1):1-4
7.Glycoproteogenomics: Setting the Course for Next-generation Cancer Neoantigen Discovery for Cancer Vaccines.
José Alexandre FERREIRA ; Marta RELVAS-SANTOS ; Andreia PEIXOTO ; André M N SILVA ; Lúcio LARA SANTOS
Genomics, Proteomics & Bioinformatics 2021;19(1):25-43
Molecular-assisted precision oncology gained tremendous ground with high-throughput next-generation sequencing (NGS), supported by robust bioinformatics. The quest for genomics-based cancer medicine set the foundations for improved patient stratification, while unveiling a wide array of neoantigens for immunotherapy. Upfront pre-clinical and clinical studies have successfully used tumor-specific peptides in vaccines with minimal off-target effects. However, the low mutational burden presented by many lesions challenges the generalization of these solutions, requiring the diversification of neoantigen sources. Oncoproteogenomics utilizing customized databases for protein annotation by mass spectrometry (MS) is a powerful tool toward this end. Expanding the concept toward exploring proteoforms originated from post-translational modifications (PTMs) will be decisive to improve molecular subtyping and provide potentially targetable functional nodes with increased cancer specificity. Walking through the path of systems biology, we highlight that alterations in protein glycosylation at the cell surface not only have functional impact on cancer progression and dissemination but also originate unique molecular fingerprints for targeted therapeutics. Moreover, we discuss the outstanding challenges required to accommodate glycoproteomics in oncoproteogenomics platforms. We envisage that such rationale may flag a rather neglected research field, generating novel paradigms for precision oncology and immunotherapy.
8.Characterization of Lysine Monomethylome and Methyltransferase in Model Cyanobacterium Synechocystis sp. PCC 6803.
Xiaohuang LIN ; Mingkun YANG ; Xin LIU ; Zhongyi CHENG ; Feng GE
Genomics, Proteomics & Bioinformatics 2020;18(3):289-304
Protein lysine methylation is a prevalent post-translational modification (PTM) and plays critical roles in all domains of life. However, its extent and function in photosynthetic organisms are still largely unknown. Cyanobacteria are a large group of prokaryotes that carry out oxygenic photosynthesis and are applied extensively in studies of photosynthetic mechanisms and environmental adaptation. Here we integrated propionylation of monomethylated proteins, enrichment of the modified peptides, and mass spectrometry (MS) analysis to identify monomethylated proteins in Synechocystis sp. PCC 6803 (Synechocystis). Overall, we identified 376 monomethylation sites in 270 proteins, with numerous monomethylated proteins participating in photosynthesis and carbon metabolism. We subsequently demonstrated that CpcM, a previously identified asparagine methyltransferase in Synechocystis, could catalyze lysine monomethylation of the potential aspartate aminotransferase Sll0480 both in vivo and in vitro and regulate the enzyme activity of Sll0480. The loss of CpcM led to decreases in the maximum quantum yield in primary photosystem II (PSII) and the efficiency of energy transfer during the photosynthetic reaction in Synechocystis. We report the first lysine monomethylome in a photosynthetic organism and present a critical database for functional analyses of monomethylation in cyanobacteria. The large number of monomethylated proteins and the identification of CpcM as the lysine methyltransferase in cyanobacteria suggest that reversible methylation may influence the metabolic process and photosynthesis in both cyanobacteria and plants.
Bacterial Proteins/metabolism*; Lysine/metabolism*; Methyltransferases/metabolism*; Photosynthesis; Protein Processing, Post-Translational; Synechocystis/growth & development*
9.Genome Size Evolution Mediated by Gypsy Retrotransposons in Brassicaceae.
Shi-Jian ZHANG ; Lei LIU ; Ruolin YANG ; Xiangfeng WANG
Genomics, Proteomics & Bioinformatics 2020;18(3):321-332
The dynamic activity of transposable elements (TEs) contributes to the vast diversity of genome size and architecture among plants. Here, we examined the genomic distribution and transposition activity of long terminal repeat retrotransposons (LTR-RTs) in Arabidopsis thaliana (Ath) and three of its relatives, Arabidopsis lyrata (Aly), Eutrema salsugineum (Esa), and Schrenkiella parvula (Spa), in Brassicaceae. Our analyses revealed the distinct evolutionary dynamics of Gypsy retrotransposons, which reflect the different patterns of genome size changes of the four species over the past million years. The rate of Gypsy transposition in Aly is approximately five times more rapid than that of Ath and Esa, suggesting an expanding Aly genome. Gypsy insertions in Esa are strictly confined to pericentromeric heterochromatin and associated with dramatic centromere expansion. In contrast, Gypsy insertions in Spa have been largely suppressed over the last million years, likely as a result of a combination of an inherent molecular mechanism of preferential DNA removal and purifying selection at Gypsy elements. Additionally, species-specific clades of Gypsy elements shaped the distinct genome architectures of Aly and Esa.
Brassicaceae/genetics*; Evolution, Molecular; Genome Size; Genome, Plant; Genomics; Phylogeny; Retroelements; Species Specificity
10.Ubiquitinome Profiling Reveals the Landscape of Ubiquitination Regulation in Rice Young Panicles.
Liya ZHU ; Han CHENG ; Guoqing PENG ; Shuansuo WANG ; Zhiguo ZHANG ; Erdong NI ; Xiangdong FU ; Chuxiong ZHUANG ; Zexian LIU ; Hai ZHOU
Genomics, Proteomics & Bioinformatics 2020;18(3):305-320
Ubiquitination, an essential post-translational modification (PTM), plays a vital role in nearly every biological process, including development and growth. Despite its functions in plant reproductive development, its targets in rice panicles remain unclear. In this study, we performed proteome-wide profiling of lysine ubiquitination in rice (O. sativa ssp. indica) young panicles. We created the largest ubiquitinome dataset in rice to date, identifying 1638 lysine ubiquitination sites on 916 unique proteins. We detected three conserved ubiquitination motifs, noting that the acidic residues glutamic acid (E) and aspartic acid (D) were most frequently present around ubiquitinated lysine. Enrichment analysis of Gene Ontology (GO) annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of these ubiquitinated proteins revealed that ubiquitination plays an important role in fundamental cellular processes in rice young panicles. Interestingly, enrichment analysis of protein domains indicated that ubiquitination was enriched on a variety of receptor-like kinases and cytoplasmic tyrosine and serine-threonine kinases. Furthermore, we analyzed the crosstalk between ubiquitination, acetylation, and succinylation, and constructed a potential protein interaction network within our rice ubiquitinome. Moreover, we identified ubiquitinated proteins related to pollen and grain development, indicating that ubiquitination may play a critical role in the physiological functions in young panicles. Taken together, we reported the most comprehensive lysine ubiquitinome in rice so far, and used it to reveal the functional role of lysine ubiquitination in rice young panicles.
Acetylation; Lysine/metabolism*; Oryza/metabolism*; Plant Proteins/metabolism*; Protein Interaction Maps; Protein Processing, Post-Translational; Proteome/metabolism*; Ubiquitin/metabolism*; Ubiquitination