1. PrimerSeq: Design and Visualization of RT-PCR Primers for Alternative Splicing Using RNA-seq Data
Tokheim COLLIN ; Park Won JUW ; Xing YI
Genomics, Proteomics & Bioinformatics 2014;(2):105-109
The vast majority of multi-exon genes in higher eukaryotes are alternatively spliced and changes in alternative splicing (AS) can impact gene function or cause disease. High-throughput RNA sequencing (RNA-seq) has become a powerful technology for transcriptome-wide analysis of AS, but RT-PCR still remains the gold-standard approach for quantifying and validating exon splicing levels. We have developed PrimerSeq, a user-friendly software for systematic design and visualization of RT-PCR primers using RNA-seq data. PrimerSeq incorporates user-provided transcriptome profiles (i.e., RNA-seq data) in the design process, and is particularly useful for large-scale quantitative analysis of AS events discovered from RNA-seq experiments. PrimerSeq features a graphical user interface (GUI) that displays the RNA-seq data juxtaposed with the expected RT-PCR results. To enable primer design and visualization on user-provided RNA-seq data and transcript annotations, we have developed PrimerSeq as a stand-alone software that runs on local computers. PrimerSeq is freely available for Windows and Mac OS X along with source code at http://primerseq.sourceforge.net/. With the growing popularity of RNA-seq for transcriptome studies, we expect PrimerSeq to help bridge the gap between high-throughput RNA-seq discovery of AS events and molecular analysis of candidate events by RT-PCR.
2. Machine Learning Modeling of Protein-intrinsic Features Predicts Tractability of Targeted Protein Degradation
Zhang WUBING ; Burman S.Roy SHOURYA ; Chen JIAYE ; A.Donovan KATHERINE ; Cao YANG ; Shu CHELSEA ; Zhang BONING ; Zeng ZEXIAN ; Gu SHENGQING ; Zhang YI ; Li DIAN ; S.Fischer ERIC ; Tokheim COLLIN ; Liu X.SHIRLEY
Genomics, Proteomics & Bioinformatics 2022;20(5):882-898
Targeted protein degradation (TPD) has rapidly emerged as a therapeutic modality to eliminate previously undruggable proteins by repurposing the cell's endogenous protein degradation machinery. However, the susceptibility of proteins for targeting by TPD approaches, termed "degradability", is largely unknown. Here, we developed a machine learning model, model-free analysis of protein degradability (MAPD), to predict degradability from features intrinsic to protein targets. MAPD shows accurate performance in predicting kinases that are degradable by TPD compounds [with an area under the precision-recall curve (AUPRC) of 0.759 and an area under the receiver operating characteristic curve (AUROC) of 0.775] and is likely generalizable to independent non-kinase proteins. We found five features with statistical significance to achieve optimal prediction, with ubiquitination potential being the most predictive. By structural modeling, we found that E2-accessible ubiquitination sites, but not lysine residues in general, are particularly associated with kinase degradability. Finally, we extended MAPD predictions to the entire proteome to find 964 disease-causing proteins (including proteins encoded by 278 cancer genes) that may be tractable to TPD drug development.