1.A review of transformer models in drug discovery and beyond.
Jian JIANG ; Long CHEN ; Lu KE ; Bozheng DOU ; Chunhuan ZHANG ; Hongsong FENG ; Yueying ZHU ; Huahai QIU ; Bengong ZHANG ; Guo-Wei WEI
Journal of Pharmaceutical Analysis 2025;15(6):101081-101081
Transformer models have emerged as pivotal tools within the realm of drug discovery, distinguished by their unique architectural features and exceptional performance in managing intricate data landscapes. Leveraging the innate capabilities of transformer architectures to comprehend intricate hierarchical dependencies inherent in sequential data, these models showcase remarkable efficacy across various tasks, including new drug design and drug target identification. The adaptability of pre-trained transformer-based models renders them indispensable assets for driving data-centric advancements in drug discovery, chemistry, and biology, furnishing a robust framework that expedites innovation and discovery within these domains. Beyond their technical prowess, the success of transformer-based models in drug discovery, chemistry, and biology extends to their interdisciplinary potential, seamlessly combining biological, physical, chemical, and pharmacological insights to bridge gaps across diverse disciplines. This integrative approach not only enhances the depth and breadth of research endeavors but also fosters synergistic collaborations and exchange of ideas among disparate fields. In our review, we elucidate the myriad applications of transformers in drug discovery, as well as chemistry and biology, spanning from protein design and protein engineering, to molecular dynamics (MD), drug target identification, transformer-enabled drug virtual screening (VS), drug lead optimization, drug addiction, small data set challenges, chemical and biological image analysis, chemical language understanding, and single cell data. Finally, we conclude the survey by deliberating on promising trends in transformer models within the context of drug discovery and other sciences.
2.Mining of gene clusters for biosynthesis of secondary metabolites and analysis of genes encoding antibiotic resistance and virulence in 4 644 representative human gut strains.
Yeshi YIN ; Hu CHEN ; Meihong ZHANG ; Linyan CAO ; Huahai CHEN
Chinese Journal of Biotechnology 2022;38(10):3682-3694
Genome sequences of 4 644 representative strains from human gut microbiota were analyzed to mine gene clusters for biosynthesis of novel secondary metabolites, as well as genes encoding antibiotic resistance and virulence factors. AntiSMASH analysis showed that more than 60% of the representative strains encoded at least one secondary metabolite gene cluster, and 8 potential novel secondary metabolite gene clusters were identified from 8 unculturable bacteria. The secondary metabolite gene clusters in the human intestine are mainly nonribosomal peptide synthetase (NRPS), bacteriocin, arylpolyene, terpene, betalactone and NRPS-like gene clusters, distributed in Clostridia, Bacilli, Gammaproteobacteria, Bacteroidia, Actinobacteria and Negativicutes. PathoFact analysis showed that genes encoding antibiotic resistance and virulence factors are widely distributed in the representative strains, but their frequency in potential pathogens is significantly higher than in non-potential pathogens. Genes encoding secretory toxins (such as outer membrane protein, PapC N-terminal domain, PapC C-terminal domain, and peptidase M16 inactive domain) and non-secretory toxins (such as nitroreductase family, AcrB/AcrD/AcrF family, PLD-like domain, Cupin domain, putative hemolysin, S24-like peptidase, phosphotransferase enzyme family, endonuclease/exonuclease/phosphatase family, and glyoxalase/bleomycin resistance) occurred at high frequency in potential pathogens. This study may facilitate mining new microbial natural products from the intestinal microbiome, understanding the colonization and infection mechanisms of intestinal microorganisms, and providing targeted prevention and treatment of diseases related to intestinal microbes.
Humans; Virulence; Multigene Family; Bacteria; Drug Resistance, Microbial; Virulence Factors; Peptide Hydrolases
3.Genetic polymorphisms of 21 non-combined DNA index system short tandem repeat loci in Hainan Li population.
Tao LI ; Yaqing ZHANG ; Ying'ai ZHANG
Chinese Journal of Medical Genetics 2021;38(5):503-505
OBJECTIVE:
To investigate the genetic polymorphisms of 21 non-combined DNA index system short tandem repeat (STR) loci in the Hainan Li population.
METHODS:
DNA samples from 339 unrelated healthy individuals of the Li population from Hainan Province were extracted and amplified with a fluorescence-labeled multiplex PCR system. PCR products were electrophoresed on an ABI3130 Genetic Analyzer following the manufacturer's instructions. Allele designation was performed with GeneMapper ID-X by comparison with the allelic ladder provided with the corresponding kit.
RESULTS:
A total of 173 alleles and 489 genotypes were observed for the 21 STR loci. The frequencies of alleles and genotypes were 0.0010-0.5434 and 0.0020-0.3274, respectively. The heterozygosity varied from 0.639 to 0.833. Discrimination power (DP) was 0.803-0.948, power of exclusion for trio-paternity was 0.416-0.584, power of exclusion for duo-paternity was 0.140-0.238, and the polymorphism information content (PIC) was 0.57-0.81. The total discrimination power (TDP), cumulative probability of exclusion for trio-paternity testing (CPE-trio) and cumulative probability of exclusion for duo-paternity testing (CPE-duo) were 0.999 999 999 999 99, 0.999 999 883 211 752, and 0.987 266, respectively.
CONCLUSION:
The 21 STR loci are highly polymorphic and informative in the studied population and can be employed as supplementary loci in duo-paternity testing or cases with variant circumstances.
Asian Continental Ancestry Group/genetics*; China; DNA; Gene Frequency; Genetics, Population; Humans; Microsatellite Repeats/genetics*; Polymorphism, Genetic
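The per-locus and cumulative statistics reported in this abstract are commonly computed from allele frequencies with standard population-genetics formulas. The following is a minimal sketch of those textbook formulas, not the authors' own code: the function names are hypothetical, the example frequencies are invented for illustration, and the cumulative values assume the loci are statistically independent.

```python
from itertools import combinations
from math import prod


def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content (Botstein et al. formula):
    1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygous = sum(p * p for p in freqs)
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


def cumulative_power(per_locus_values):
    """Combine per-locus powers (DP or PE) across loci assumed independent:
    1 - product(1 - x_i). Gives TDP from DP values, CPE from PE values."""
    return 1.0 - prod(1.0 - x for x in per_locus_values)


# Hypothetical two-allele locus with equal frequencies (illustration only).
example_freqs = [0.5, 0.5]
print(expected_heterozygosity(example_freqs))
print(pic(example_freqs))
# Two hypothetical loci, each with a power of exclusion of 0.5.
print(cumulative_power([0.5, 0.5]))
```

Because each locus multiplies the remaining non-exclusion probability, even moderate per-locus values approach 1 quickly when many independent loci are combined.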
4.An algorithm for three-dimensional pulmonary parenchymal segmentation by integrating surfacelet transform with pulse coupled neural network.
Huahai ZHANG ; Peirui BAI ; Ziyang GUO ; Linghao DU ; Chang LI ; Yande REN ; Kai YANG ; Qingyi LIU
Journal of Biomedical Engineering 2020;37(4):630-640
To overcome the difficulty of lung parenchymal segmentation caused by factors such as lung disease and bronchial interference, a three-dimensional lung parenchyma segmentation algorithm is presented that integrates the surfacelet transform with a pulse coupled neural network (PCNN). First, the three-dimensional computed tomography volume of the lungs is decomposed into the surfacelet transform domain to obtain multi-scale and multi-directional sub-band information. The edge features are then enhanced by filtering the sub-band coefficients with a local modified Laplacian operator. Second, the inverse surfacelet transform is applied and the reconstructed image is fed to the input of the PCNN. Finally, the PCNN is iterated to obtain the final segmentation result. The proposed algorithm is validated on samples from a public dataset. The experimental results demonstrate that it outperforms the three-dimensional surfacelet transform edge detection algorithm, the three-dimensional region growing algorithm, and the three-dimensional U-Net algorithm. It effectively suppresses interference from lung lesions and bronchi, and obtains a complete structure of the lung parenchyma.
Algorithms; Neural Networks, Computer; Tomography, X-Ray Computed
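The final PCNN iteration step described in this abstract can be illustrated with a generic, simplified PCNN. The sketch below is an assumption-laden toy, not the authors' algorithm: it works on a small two-dimensional grid instead of a three-dimensional volume, omits the surfacelet enhancement stage entirely, and uses a hypothetical function name and hypothetical parameter values (`beta`, `alpha`, `v_theta`, `iterations`).

```python
import math


def pcnn_segment(image, beta=0.2, alpha=0.3, v_theta=20.0, iterations=5):
    """Simplified PCNN on a 2D grid of intensities in [0, 1].

    Returns a binary mask marking every neuron that fired at least once.
    All parameter defaults are illustrative assumptions.
    """
    rows, cols = len(image), len(image[0])
    theta = [[1.0] * cols for _ in range(rows)]  # dynamic thresholds
    fired = [[0] * cols for _ in range(rows)]    # pulses from previous step
    mask = [[0] * cols for _ in range(rows)]
    decay = math.exp(-alpha)

    for _ in range(iterations):
        new_fired = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # Linking input: pulses of the 8 neighbours at the previous step.
                link = sum(
                    fired[rr][cc]
                    for rr in range(max(0, r - 1), min(rows, r + 2))
                    for cc in range(max(0, c - 1), min(cols, c + 2))
                    if (rr, cc) != (r, c)
                )
                # Internal activity: feeding input modulated by linking input.
                activity = image[r][c] * (1.0 + beta * link)
                if activity > theta[r][c]:
                    new_fired[r][c] = 1
                    mask[r][c] = 1
        # Thresholds decay, then jump for neurons that just fired.
        for r in range(rows):
            for c in range(cols):
                theta[r][c] = decay * theta[r][c] + v_theta * new_fired[r][c]
        fired = new_fired
    return mask


# Hypothetical 4x4 image: a bright 2x2 block on a dark background.
toy = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
for row in pcnn_segment(toy):
    print(row)
```

In this toy model the iteration count acts as the stopping criterion: thresholds decay every step, so running for too many iterations eventually lets the dark background fire as well.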
5.Correlation analysis of urinary albumin/urine creatinine ratio with NEW-TOAST subtypes in acute cerebral infarction
Jingjuan CHEN ; Chengguo ZHANG ; Guode LI ; Guanglun ZENG ; Piao DU ; Guohua ZHANG ; Huahai FENG
Chinese Journal of Neuromedicine 2014;13(8):799-802
OBJECTIVE:
To measure the urinary albumin level and the urinary albumin/urine creatinine ratio in patients with acute cerebral infarction and to explore their relations with NEW-TOAST typing.
METHODS:
One hundred and sixty-eight patients with acute cerebral infarction, admitted to our hospital from March 2011 to March 201, were enrolled, and 45 healthy subjects served as controls. The patients were divided into subgroups according to NEW-TOAST typing. Their clinical data were retrospectively analyzed. The 24-hour urinary albumin level and the urinary albumin/urine creatinine ratio were measured, and their relation was analyzed between the patient group and controls and between patients of different subtypes. In addition, the correlation of neurologic impairment (NIHSS) scores with the urinary albumin/urine creatinine ratio was analyzed.
RESULTS:
The 24-hour urinary albumin level and the urinary albumin/urine creatinine ratio were positively correlated (r=0.301, P=0.001). By NEW-TOAST subtype, patients with large artery atherosclerosis and small artery occlusion had significantly higher 24-hour urinary albumin levels and urinary albumin/urine creatinine ratios (P<0.05). NIHSS scores and the urinary albumin/creatinine ratio in the patient group were positively correlated (r=0.215, P=0.001).
CONCLUSION:
Acute cerebral infarction and kidney disease are closely correlated; both the 24-hour urinary albumin level and the urinary albumin/urine creatinine ratio can serve as predictors of acute cerebral infarction and influence the prognosis.
