GPT2-ICC: A data-driven approach for accurate ion channel identification using pre-trained large language models
10.1016/j.jpha.2025.101302
- Author:
Zihan ZHOU 1; Yang YU; Chengji YANG; Leyan CAO; Shaoying ZHANG; Junnan LI; Yingnan ZHANG; Huayun HAN; Guoliang SHI; Qiansen ZHANG; Juwen SHEN; Huaiyu YANG
Author Information
1. Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Publication Type:Journal Article
- Keywords:
Ion channel;
Artificial intelligence;
Representation learning;
GPT2;
Protein language model
- From:
Journal of Pharmaceutical Analysis
2025;15(8):1800-1809
- Country:China
- Language:English
- Abstract:
Current experimental and computational methods have limitations in accurately and efficiently classifying ion channels within vast protein spaces. Here, we developed a deep learning algorithm, GPT2 Ion Channel Classifier (GPT2-ICC), which effectively distinguishes ion channels from a test set containing approximately 239 times more non-ion-channel proteins. GPT2-ICC integrates representation learning with a large language model (LLM)-based classifier, enabling highly accurate identification of potential ion channels. Several potential ion channels were predicted from the unannotated human proteome, further demonstrating GPT2-ICC's generalization ability. This study marks a significant advancement in artificial-intelligence-driven ion channel research, highlighting the adaptability and effectiveness of combining representation learning with LLMs to address the challenges of imbalanced protein sequence data. Moreover, it provides a valuable computational tool for uncovering previously uncharacterized ion channels.
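To make the roughly 1:239 class imbalance mentioned in the abstract concrete, the following minimal sketch shows why plain accuracy says little at that ratio. It is not the evaluation protocol of GPT2-ICC: the helper `confusion_metrics`, the test-set size, and both sets of confusion-matrix counts are hypothetical values chosen only for illustration.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and MCC from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (e.g. a model that never predicts positive).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "mcc": mcc}


# Hypothetical test set: 100 ion channels and 239 times as many other proteins.
N_POS = 100
N_NEG = 239 * N_POS

# Trivial baseline that labels every protein as "non-ion-channel".
always_negative = confusion_metrics(tp=0, fp=0, fn=N_POS, tn=N_NEG)

# Made-up classifier: recovers 90 of the 100 channels with 50 false alarms.
hypothetical = confusion_metrics(tp=90, fp=50, fn=10, tn=N_NEG - 50)

for name, metrics in [("always negative", always_negative),
                      ("hypothetical model", hypothetical)]:
    print(name, {key: round(value, 4) for key, value in metrics.items()})
```

In this toy setting the trivial baseline exceeds 99% accuracy while finding no ion channels at all, which is why metrics such as precision, recall, or the Matthews correlation coefficient are usually more informative for data this imbalanced.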