1.GliomaDB: A Web Server for Integrating Glioma Omics Data and Interactive Analysis
Yang YADONG ; Sui YANG ; Xie BINGBING ; Qu HONGZHU ; Fang XIANGDONG
Genomics, Proteomics & Bioinformatics 2019;17(4):465-471
Gliomas are one of the most common types of brain cancers. Numerous efforts have been devoted to studying the mechanisms of glioma genesis and identifying biomarkers for diagnosis and treatment. To help further investigations, we present a comprehensive database named GliomaDB. GliomaDB includes 21,086 samples from 4303 patients and integrates genomic, transcriptomic, epigenomic, clinical, and gene-drug association data regarding glioblastoma multiforme (GBM) and low-grade glioma (LGG) from The Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), the Chinese Glioma Genome Atlas (CGGA), the Memorial Sloan Kettering Cancer Center Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT), the US Food and Drug Administration (FDA), and PharmGKB. GliomaDB offers a user-friendly interface for two main types of functionalities. The first comprises queries of (i) somatic mutations, (ii) gene expression, (iii) microRNA (miRNA) expression, and (iv) DNA methylation. In addition, queries can be executed at the gene, region, and base level. Second, GliomaDB allows users to perform survival analysis, coexpression network visualization, multi-omics data visualization, and targeted drug recommendations based on personalized variations. GliomaDB bridges the gap between glioma genomics big data and the delivery of integrated information for end users, thus enabling both researchers and clinicians to effectively use publicly available data and empowering the progression of precision medicine in glioma. GliomaDB is freely accessible at http://bigd.big.ac.cn/gliomaDB.
2.Biological Databases for Hematology Research
Zhang QIAN ; Ding NAN ; Zhang LU ; Zhao XUETONG ; Yang YADONG ; Qu HONGZHU ; Fang XIANGDONG
Genomics, Proteomics & Bioinformatics 2016;14(6):333-337
With the advances of genome-wide sequencing technologies and bioinformatics approaches, a large number of datasets of normal and malignant erythropoiesis have been generated and made public to researchers around the world. Collection and integration of these datasets greatly facilitate basic research and the clinical diagnosis and treatment of blood disorders. Here we provide a brief introduction to the most popular omics data resources for normal and malignant hematopoiesis, including some integrated web tools, to help users become better equipped to perform common analyses. We hope this review will promote awareness and facilitate the use of public database resources in hematology research.
3.Databases and Web Tools for Cancer Genomics Study
Yang YADONG ; Dong XUNONG ; Xie BINGBING ; Ding NAN ; Chen JUAN ; Li YONGJUN ; Zhang QIAN ; Qu HONGZHU ; Fang XIANGDONG
Genomics, Proteomics & Bioinformatics 2015;(1):46-50
Publicly accessible resources have promoted the advance of scientific discovery. The era of genomics and big data has brought the need for collaboration and data sharing in order to make effective use of this new knowledge. Here, we describe the web resources for cancer genomics research and rate them on the basis of the diversity of cancer types, sample size, omics data comprehensiveness, and user experience. The resources reviewed include data repositories and analysis tools, and we hope this introduction will promote awareness and facilitate the use of these resources in the cancer research community.
4.Common Postzygotic Mutational Signatures in Healthy Adult Tissues Related to Embryonic Hypoxia
Hong YAQIANG ; Zhang DAKE ; Zhou XIANGTIAN ; Chen AILI ; Abliz AMIR ; Bai JIAN ; Wang LIANG ; Hu QINGTAO ; Gong KENAN ; Guan XIAONAN ; Liu MENGFEI ; Zheng XINCHANG ; Lai SHUJUAN ; Qu HONGZHU ; Zhao FUXIN ; Hao SHUANG ; Wu ZHEN ; Cai HONG ; Hu SHAOYAN ; Ma YUE ; Zhang JUNTING ; Ke YANG ; Wang QIAN-FEI ; Chen WEI ; Zeng CHANGQING
Genomics, Proteomics & Bioinformatics 2022;20(1):177-191
Postzygotic mutations are acquired in normal tissues throughout an individual's lifetime and hold clues for identifying mutagenic factors. Here, we investigated postzygotic mutation spectra of healthy individuals using optimized ultra-deep exome sequencing of the time-series samples from the same volunteer as well as the samples from different individuals. In blood, sperm, and muscle cells, we resolved three common types of mutational signatures. Signatures A and B represent clock-like mutational processes, and the polymorphisms of epigenetic regulation genes influence the proportion of signature B in mutation profiles. Notably, signature C, characterized by C>T transitions at GpCpN sites, tends to be a feature of diverse normal tissues. Mutations of this type are likely to occur early during embryonic development, supported by their relatively high allelic frequencies, presence in multiple tissues, and decrease in occurrence with age. Almost none of the public datasets for tumors feature this signature, except for 19.6% of samples of clear cell renal cell carcinoma with increased activation of the hypoxia-inducible factor 1 (HIF-1) signaling pathway. Moreover, the accumulation of signature C in the mutation profile was accelerated in a human embryonic stem cell line with drug-induced activation of HIF-1α. Thus, embryonic hypoxia may explain this novel signature across multiple normal tissues. Our study suggests that hypoxic condition in an early stage of embryonic development is a crucial factor inducing C>T transitions at GpCpN sites; and individuals' genetic background may also influence their postzygotic mutation profiles.
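To make the "C>T transitions at GpCpN sites" definition concrete, the following is a minimal Python sketch of how one might test whether a single-base substitution falls in that context. It is an illustration only, not the authors' pipeline: the function names `reverse_complement` and `is_gpcpn_c_to_t` are hypothetical, and the sketch assumes the common convention of reporting substitutions on the strand where the reference base is a pyrimidine, so that a G>A change followed by C on the given strand is counted as the same event.

```python
# Hypothetical helper names; this is an illustrative sketch, not the method used in the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_gpcpn_c_to_t(context, alt):
    """Check whether a substitution is a C>T change at a GpCpN site.

    context: three reference bases centred on the mutated position (5' to 3').
    alt: the single alternate base observed at the central position.
    """
    context = context.upper()
    alt = alt.upper()
    if len(context) != 3 or len(alt) != 1 or set(context + alt) - set("ACGT"):
        raise ValueError("expected a 3-base context and a 1-base alt over A/C/G/T")
    # Assumption: purine-centred changes are flipped to the opposite strand
    # so that every substitution is described from a pyrimidine reference base.
    if context[1] in "AG":
        context = reverse_complement(context)
        alt = reverse_complement(alt)
    # G on the 5' side, mutated C in the middle, any base on the 3' side.
    return context[0] == "G" and context[1] == "C" and alt == "T"
```

For example, `is_gpcpn_c_to_t("GCA", "T")` describes a C>T change preceded by G, while `is_gpcpn_c_to_t("TGC", "A")` describes the same kind of event observed from the opposite strand.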
5.Whole Genome Analyses of Chinese Population and De Novo Assembly of A Northern Han Genome.
Zhenglin DU ; Liang MA ; Hongzhu QU ; Wei CHEN ; Bing ZHANG ; Xi LU ; Weibo ZHAI ; Xin SHENG ; Yongqiao SUN ; Wenjie LI ; Meng LEI ; Qiuhui QI ; Na YUAN ; Shuo SHI ; Jingyao ZENG ; Jinyue WANG ; Yadong YANG ; Qi LIU ; Yaqiang HONG ; Lili DONG ; Zhewen ZHANG ; Dong ZOU ; Yanqing WANG ; Shuhui SONG ; Fan LIU ; Xiangdong FANG ; Hua CHEN ; Xin LIU ; Jingfa XIAO ; Changqing ZENG
Genomics, Proteomics & Bioinformatics 2019;17(3):229-247
Unraveling the genetic mechanisms of disease and physiological traits requires comprehensive sequencing analysis of large samples in Chinese populations. Here, we report the primary results of the Chinese Academy of Sciences Precision Medicine Initiative (CASPMI) project launched by the Chinese Academy of Sciences, including the de novo assembly of a northern Han reference genome (NH1.0) and whole genome analyses of 597 healthy people from most areas of China. Given that the two existing reference genomes for Han Chinese (YH and HX1) were both from the south, we constructed NH1.0, a new reference genome from a northern individual, by combining the sequencing strategies of PacBio, 10× Genomics, and Bionano mapping. Using this integrated approach, we obtained an N50 scaffold size of 46.63 Mb for the NH1.0 genome and performed a comparative genome analysis of NH1.0 with YH and HX1. In order to generate a genomic variation map of Chinese populations, we performed whole-genome sequencing of 597 participants and identified 24.85 million (M) single nucleotide variants (SNVs), 3.85 M small indels, and 106,382 structural variations. In the association analysis with collected phenotypes, we found that the T allele of rs1549293 in KAT8 significantly correlated with waist circumference in northern Han males. Moreover, significant genetic diversity in MTHFR, TCN2, FADS1, and FADS2, which are associated with circulating folate, vitamin B12, or lipid metabolism, was observed between northerners and southerners. In particular, for the homocysteine-increasing allele of rs1801133 (MTHFR 677T), we hypothesize that there exists a "comfort" zone for a high frequency of 677T between latitudes of 35-45 degrees North. Taken together, our results provide a high-quality northern Han reference genome and novel population-specific data sets of genetic variants for use in personalized and precision medicine.
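The N50 scaffold size quoted above is a standard assembly contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the total assembly. As a minimal sketch of that definition (the function name `n50` and the example lengths are illustrative assumptions, not data from the NH1.0 assembly), it could be computed like this:

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Made-up scaffold lengths, purely for illustration.
example_lengths = [80, 70, 50, 40, 30, 20]
print(n50(example_lengths))
```

Because the statistic depends only on the sorted lengths, it says nothing about assembly correctness; it is usually reported alongside other measures when comparing assemblies.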