1. Machine Learning Modeling of Protein-intrinsic Features Predicts Tractability of Targeted Protein Degradation
Wubing Zhang; Shourya S. Roy Burman; Jiaye Chen; Katherine A. Donovan; Yang Cao; Chelsea Shu; Boning Zhang; Zexian Zeng; Shengqing Gu; Yi Zhang; Dian Li; Eric S. Fischer; Collin Tokheim; X. Shirley Liu
Genomics, Proteomics & Bioinformatics 2022;20(5):882-898
Targeted protein degradation (TPD) has rapidly emerged as a therapeutic modality to eliminate previously undruggable proteins by repurposing the cell's endogenous protein degradation machinery. However, the susceptibility of proteins for targeting by TPD approaches, termed "degradability", is largely unknown. Here, we developed a machine learning model, model-free analysis of protein degradability (MAPD), to predict degradability from features intrinsic to protein targets. MAPD shows accurate performance in predicting kinases that are degradable by TPD compounds [with an area under the precision-recall curve (AUPRC) of 0.759 and an area under the receiver operating characteristic curve (AUROC) of 0.775] and is likely generalizable to independent non-kinase proteins. We found five features with statistical significance to achieve optimal prediction, with ubiquitination potential being the most predictive. By structural modeling, we found that E2-accessible ubiquitination sites, but not lysine residues in general, are particularly associated with kinase degradability. Finally, we extended MAPD predictions to the entire proteome to find 964 disease-causing proteins (including proteins encoded by 278 cancer genes) that may be tractable to TPD drug development.
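For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how AUROC and AUPRC can be computed for any binary classifier. It is not the MAPD code: the function names `auroc` and `auprc`, the toy labels, and the toy scores are all hypothetical, and AUPRC is approximated here by average precision, which may differ from the estimator the authors used.

```python
# Illustrative only: toy labels/scores, not data from the paper.

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auprc(labels, scores):
    """Average precision: mean precision at the rank of each positive.

    Assumes scores are distinct (tied scores are not specially handled)
    and that at least one positive label is present.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` items
    return total / hits


# Hypothetical example: 1 = degradable, 0 = not degradable.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

if __name__ == "__main__":
    print("AUROC:", auroc(labels, scores))
    print("AUPRC (average precision):", auprc(labels, scores))
```

A random classifier has an expected AUROC of 0.5, whereas the AUPRC baseline equals the fraction of positives in the data, so the two reported values should be read against different baselines.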