1. GPT2-ICC: A data-driven approach for accurate ion channel identification using pre-trained large language models.
Zihan ZHOU ; Yang YU ; Chengji YANG ; Leyan CAO ; Shaoying ZHANG ; Junnan LI ; Yingnan ZHANG ; Huayun HAN ; Guoliang SHI ; Qiansen ZHANG ; Juwen SHEN ; Huaiyu YANG
Journal of Pharmaceutical Analysis 2025;15(8):101302-101302
Current experimental and computational methods are limited in how accurately and efficiently they can classify ion channels within vast protein spaces. Here we developed a deep learning algorithm, GPT2 Ion Channel Classifier (GPT2-ICC), which effectively distinguishes ion channels in a test set containing approximately 239 times more non-ion-channel proteins. GPT2-ICC integrates representation learning with a large language model (LLM)-based classifier, enabling highly accurate identification of potential ion channels. Several potential ion channels were predicted from the unannotated human proteome, further demonstrating GPT2-ICC's generalization ability. This study marks a significant advance in artificial-intelligence-driven ion channel research, highlighting the adaptability and effectiveness of combining representation learning with LLMs to address the challenges of imbalanced protein sequence data. Moreover, it provides a valuable computational tool for uncovering previously uncharacterized ion channels.
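To illustrate why a test set with roughly 239 non-ion-channel proteins per ion channel is demanding, the following minimal sketch computes the precision a classifier would reach at that ratio. The function name and the sensitivity and specificity values are hypothetical and chosen only for illustration; they are not results reported for GPT2-ICC.

```python
def precision_at_imbalance(sensitivity: float, specificity: float,
                           neg_per_pos: float) -> float:
    """Precision of a binary classifier when there are `neg_per_pos`
    negative examples for every positive example."""
    true_pos = sensitivity                        # per single positive example
    false_pos = (1.0 - specificity) * neg_per_pos  # from the matching negatives
    if true_pos + false_pos == 0.0:
        return 0.0                                # nothing was predicted positive
    return true_pos / (true_pos + false_pos)


# Ratio taken from the abstract: about 239 non-ion-channels per ion channel.
NEG_PER_POS = 239

# Hypothetical operating points (assumptions, not values from the paper).
for spec in (0.99, 0.999, 0.9999):
    p = precision_at_imbalance(0.95, spec, NEG_PER_POS)
    print(f"specificity={spec:.4f} -> precision={p:.3f}")
```

At this ratio, even a small false-positive rate produces many false positives per true ion channel, which is why precision is a more informative measure than overall accuracy for this kind of task.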
