1.Comparison of Normalization Methods for Defining Copy Number Variation Using Whole-genome SNP Genotyping Data.
Ji Hong KIM ; Seon Hee YIM ; Yong Bok JEONG ; Seong Hyun JUNG ; Hai Dong XU ; Seung Hun SHIN ; Yeun Jun CHUNG
Genomics & Informatics 2008;6(4):231-234
Precise and reliable identification of CNV is still important to fully understand the effect of CNV on genetic diversity and the background of complex diseases. SNP markers have been used frequently to detect CNVs, but the analysis of SNP chip data for identifying CNV has not been well established. We compared various normalization methods for CNV analysis and suggest an optimal normalization procedure for reliable CNV calling. Four normal Koreans and NA10851 HapMap male samples were genotyped using the Affymetrix Genome-Wide Human SNP Array 5.0. We evaluated the effect of median and quantile normalization to find the optimal normalization for CNV detection based on SNP array data. We also explored the effect of Robust Multichip Average (RMA) background correction for each normalization process. In total, the following 4 combinations of normalization were tried: 1) median normalization without RMA background correction, 2) quantile normalization without RMA background correction, 3) median normalization with RMA background correction, and 4) quantile normalization with RMA background correction. CNV was called using the SW-ARRAY algorithm. We compared the effects of the 4 combinations using intensity ratio profiles, box plots, and MA plots. When we applied median and quantile normalization without RMA background correction, both methods showed a similar normalization effect, and the final CNV calls were also similar in number and size. In both median and quantile normalization, RMA background correction widened the range of the intensity ratio distribution, which suggests that RMA background correction may help to detect more CNVs than no correction.
Coat Protein Complex I; Genetic Variation; HapMap Project; Humans; Male
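The two normalization schemes compared in entry 1 can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not give their exact formulas, so the version below assumes one common variant of each method (scaling every array to a shared median, and replacing each value by the cross-array mean at its rank). The function names and the toy intensities are hypothetical, and RMA background correction is not shown.

```python
from statistics import median


def median_normalize(samples):
    """Scale each sample so its median equals the mean of all sample medians."""
    medians = [median(s) for s in samples]
    target = sum(medians) / len(medians)
    return [[x * target / m for x in s] for s, m in zip(samples, medians)]


def quantile_normalize(samples):
    """Give every sample the same intensity distribution.

    Each value is replaced by the mean, across samples, of the values that
    hold the same rank. Ties are broken by position (an assumption).
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]
    result = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        result.append(out)
    return result


# Toy probe intensities for three hypothetical arrays (not real data).
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
med_norm = median_normalize(arrays)
q_norm = quantile_normalize(arrays)
```

Median normalization only aligns the centers of the distributions, while quantile normalization makes the whole distributions identical, which is why the two can behave differently at the tails where CNV signals sit.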
2.Computational Approach for Biosynthetic Engineering of Post-PKS Tailoring Enzymes.
Genomics & Informatics 2008;6(4):227-230
Compounds of polyketide origin possess a wealth of pharmacological effects, including antibacterial, antifungal, antiparasitic, anticancer and immunosuppressive activities. Many of these compounds and their semisynthetic derivatives are used today in the clinic. Most of the gene clusters encoding commercially important drugs have also been cloned and sequenced and their biosynthetic mechanisms studied in great detail. The area of biosynthetic engineering of the enzymes involved in polyketide biosynthesis has recently advanced and been transferred into the industrial arena. In this work, we introduce a computational system to provide the user with a wealth of information that can be utilized for biosynthetic engineering of enzymes involved in post-PKS tailoring steps. Post-PKS tailoring steps are necessary to add functional groups essential for the biological activity and are therefore important in polyketide biosynthesis.
Clone Cells; Multigene Family
3.Computational Approach for the Analysis of Post-PKS Glycosylation Step.
Genomics & Informatics 2008;6(4):223-226
We introduce a computational approach for the analysis of glycosylation in post-PKS tailoring steps. The method predicts the deoxysugar biosynthesis unit pathway and the substrate specificity of glycosyltransferases involved in the glycosylation of polyketides. In this work, a directed and weighted graph is introduced to represent and predict the deoxysugar biosynthesis unit pathway. In addition, a homology-based gene clustering method is used to predict the substrate specificity of glycosyltransferases. The approach is useful for the rational design of polyketide natural products, which leads to in silico drug discovery.
Biological Agents; Computer Simulation; Glycosylation; Glycosyltransferases; Polyketides; Substrate Specificity
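Entry 3 represents the deoxysugar biosynthesis pathway as a directed, weighted graph. The abstract does not say how the weights are defined or how a pathway is selected from the graph, so the sketch below makes a loud assumption: weights are treated as non-negative costs and the predicted route is the lowest-cost path, found with Dijkstra's algorithm. The node names and weights are placeholders, not real pathway data.

```python
import heapq


def lowest_cost_path(graph, start, goal):
    """Dijkstra's algorithm on a dict-of-dicts directed graph.

    Assumes non-negative edge weights. Returns (cost, path) or None.
    """
    queue = [(0.0, start, [start])]
    settled = {}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled and settled[node] <= cost:
            continue  # already reached this node at least as cheaply
        settled[node] = cost
        for nxt, weight in graph.get(node, {}).items():
            heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return None


# Hypothetical intermediates and edge weights (placeholders only).
pathway_graph = {
    "precursor": {"intermediate_1": 1.0, "intermediate_2": 4.0},
    "intermediate_1": {"intermediate_2": 1.5, "product": 5.0},
    "intermediate_2": {"product": 1.0},
}
result = lowest_cost_path(pathway_graph, "precursor", "product")
```

If the weights were instead scores to be maximized (for example, homology-based confidence), the same graph structure would apply but a different search would be needed.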
4.hpvPDB: An Online Proteome Reserve for Human Papillomavirus.
Satish KUMAR ; Lingaraja JENA ; Sangeeta DAF ; Kanchan MOHOD ; Peyush GOYAL ; Ashok K VARMA
Genomics & Informatics 2013;11(4):289-291
Human papillomavirus (HPV) infection is the leading cause of cancer mortality among women worldwide. A molecular understanding of HPV proteins is important for understanding how the virus invades the host and for designing novel protein vaccines and anti-viral agents. Genomic, proteomic, structural, and disease-related information on HPV is available on the web; yet it is sparsely annotated and, more importantly, not well tailored for data analysis, host-pathogen interaction studies, strain-disease association, drug design, or sequence analysis. We therefore designed an online reserve with comprehensive information on HPV for end users who need it. The Human Papillomavirus Proteome Database (hpvPDB) houses proteomic and genomic information on 150 HPV strains sequenced to date. Its important features include easy expansion and retrieval of strain-specific data, a provision for sequence analysis and exploration of predicted structures, and easy access for curation and annotation through a range of search options on one platform. The rich information in this reserve could help researchers involved in structural virology, cancer research, drug discovery, and vaccine design.
DNA Probes; Drug Design; Drug Discovery; Female; Genome; Host-Pathogen Interactions; Humans*; Mortality; Proteome*; Residence Characteristics; Sequence Analysis; Statistics as Topic; Vaccines; Virology
5.Molecular Vibration-Activity Relationship in the Agonism of Adenosine Receptors.
Genomics & Informatics 2013;11(4):282-288
The molecular vibration-activity relationship in the receptor-ligand interaction of adenosine receptors was investigated by structure similarity, molecular vibration, and hierarchical clustering in a dataset of 46 ligands of adenosine receptors. The resulting dendrogram was compared with those of another kind of fingerprint or descriptor. The dendrogram result produced by corralled intensity of molecular vibrational frequency outperformed four other analyses in the current study of adenosine receptor agonism and antagonism. The tree that was produced by clustering analysis of molecular vibration patterns showed its potential for the functional classification of adenosine receptor ligands.
Adenosine*; Classification; Dataset; Dermatoglyphics; Felodipine*; Ligands; Receptors, G-Protein-Coupled; Receptors, Purinergic P1*; Subject Headings; Vibration
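The hierarchical clustering step in entry 5 can be sketched as follows. The abstract does not state the linkage rule or distance metric, so this example assumes average linkage with Euclidean distance; the ligand labels and two-dimensional feature vectors stand in for binned vibrational intensities and are entirely hypothetical.

```python
from itertools import combinations
from math import dist


def average_linkage_merges(points, labels):
    """Agglomerative clustering with average linkage (assumed here).

    Returns the merge history as (members_a, members_b, distance) tuples,
    which is the information a dendrogram is drawn from.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []

    def linkage(pair):
        a, b = pair
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average pairwise distance.
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append(([labels[i] for i in a], [labels[i] for i in b], linkage((a, b))))
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return merges


# Hypothetical feature vectors for four ligands (not real spectra).
labels = ["ligand_A", "ligand_B", "ligand_C", "ligand_D"]
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
merges = average_linkage_merges(points, labels)
```

Comparing dendrograms built from different descriptors, as the abstract describes, would amount to running this on each feature set and checking which tree best separates agonists from antagonists.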
6.Forensic Body Fluid Identification by Analysis of Multiple RNA Markers Using NanoString Technology.
Jong Lyul PARK ; Seong Min PARK ; Jeong Hwan KIM ; Han Chul LEE ; Seung Hwan LEE ; Kwang Man WOO ; Seon Young KIM
Genomics & Informatics 2013;11(4):277-281
RNA analysis has become a reliable method of body fluid identification for forensic use. Previously, we developed a combination of four multiplex quantitative PCR (qRT-PCR) probes to discriminate four different body fluids (blood, semen, saliva, and vaginal secretion). While those markers successfully identified most body fluid samples, there were some cases of false-positive and false-negative identification. To improve the accuracy of the identification further, we used multiple markers per body fluid and adopted the NanoString nCounter system instead of a multiplex qRT-PCR system. After measuring tens of RNA markers, we evaluated the accuracy of each marker for body fluid identification. For body fluids such as blood and semen, each body fluid-specific marker was accurate enough for perfect identification. However, for saliva and vaginal secretion, no single marker was perfect. Thus, we designed a logistic regression model with multiple markers for saliva and vaginal secretion and achieved almost perfect identification. In conclusion, the NanoString nCounter is an efficient platform for measuring multiple RNA markers per body fluid and will be useful for forensic RNA analysis.
Body Fluids*; Logistic Models; Polymerase Chain Reaction; RNA*; Saliva; Semen; Transcutaneous Electric Nerve Stimulation
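The multi-marker logistic regression mentioned in entry 6 can be illustrated with a small self-contained sketch. It is not the authors' model: the marker values, the two-marker setup, the learning rate, and the plain gradient-descent fit are all assumptions made for illustration, and the numbers are invented rather than measured nCounter counts.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(bias + sum(w * x for w, x in zip(weights, xi))) - yi
            for j, x in enumerate(xi):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, xi):
    """Return True when the modeled probability of the target fluid is >= 0.5."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, xi))) >= 0.5


# Invented, already-scaled expression values for two hypothetical markers.
# Label 1 = target body fluid (e.g. saliva), 0 = any other fluid.
X = [(2.0, 0.5), (1.8, 0.7), (2.2, 0.4), (0.3, 1.9), (0.5, 2.1), (0.4, 1.7)]
y = [1, 1, 1, 0, 0, 0]
weights, bias = fit_logistic(X, y)
calls = [predict(weights, bias, xi) for xi in X]
```

Combining markers this way lets a sample be classified correctly even when no single marker separates the fluids on its own, which is the situation the abstract reports for saliva and vaginal secretion.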
7.ERRATUM: Acknowledgments Correction. Cell-Free miR-27a, a Potential Diagnostic and Prognostic Biomarker for Gastric Cancer.
Jong Lyul PARK ; Mirang KIM ; Kyu Sang SONG ; Seon Young KIM ; Yong Sung KIM
Genomics & Informatics 2015;13(4):156-156
The funding acknowledgment in this article was partially omitted as published.
8.Building the Frequency Profile of the Core Promoter Element Patterns in the Three ChromHMM Promoter States at 200bp Intervals: A Statistical Perspective.
Heather LENT ; Kyung Eun LEE ; Hyun Seok PARK
Genomics & Informatics 2015;13(4):152-155
Recently, the Encyclopedia of DNA Elements (ENCODE) Analysis Working Group converted data from ChIP-seq analyses from the Broad Histone track into 15 corresponding chromatic maps that label sequences with different kinds of histone modifications in promoter regions. Here, we publish a frequency profile of the three ChromHMM promoter states, at 200-bp intervals, with particular reference to the existence of sequence patterns of promoter elements, GC-richness, and transcription start sites. Through detailed and diligent analysis of promoter regions, researchers will be able to uncover new and significant information about transcription initiation and gene function.
DNA; Epigenomics; Histones; Promoter Regions, Genetic
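A frequency profile at 200-bp intervals, as described in entry 8, can be sketched by counting a promoter motif and measuring GC content in consecutive windows. The motif string (a TATA-box-like hexamer), the function name, and the toy sequence are illustrative assumptions; the abstract does not list the exact element patterns or how windows were aligned to ChromHMM states.

```python
def bin_profile(sequence, motif, bin_size=200):
    """Per-bin motif count and GC fraction over consecutive windows.

    Motif occurrences that straddle a bin boundary are not counted,
    which is a simplification of this sketch.
    """
    profile = []
    for start in range(0, len(sequence), bin_size):
        window = sequence[start:start + bin_size]
        count = sum(
            1
            for i in range(len(window) - len(motif) + 1)
            if window[i:i + len(motif)] == motif
        )
        gc = (window.count("G") + window.count("C")) / len(window)
        profile.append({"start": start, "motif_count": count, "gc_fraction": gc})
    return profile


# Toy 400-bp sequence: one motif copy and a GC-rich first bin, AT-only second bin.
toy_sequence = "TATAAA" + "G" * 194 + "AT" * 100
profile = bin_profile(toy_sequence, "TATAAA")
```

Aggregating such per-bin records over all regions assigned to each promoter state would give the kind of state-by-state frequency table the abstract refers to.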
9.Heritability Estimated Using 50K SNPs Indicates Missing Heritability Problem in Holstein Breeding.
Donghyun SHIN ; Kyoung Do PARK ; Sojoeng KA ; Heebal KIM ; Kwang Hyeon CHO
Genomics & Informatics 2015;13(4):146-151
Previous studies in Holstein have shown 35% to 51.8% heritability in milk production traits, such as milk yield, fat, and protein, using pedigree data. Other studies have shown that variation in complex human traits can be captured by common single-nucleotide polymorphisms (SNPs) and that the genetic variation attributed to individual chromosomes is proportional to their length. Using genome-wide estimation and partitioning approaches, we analyzed three quantitative traits relevant to milk production in Korean Holstein data collected from 462 individuals genotyped for 54,609 SNPs. For all three traits (milk yield, fat, and protein), we estimated a nominally significant (p = 0.1) proportion of variance explained by all SNPs on the Illumina BovineSNP50 Beadchip (h(2)(G)). These common SNPs explained most of the narrow-sense heritability. Longer genomic regions tended to provide more phenotypic variation information, with a correlation of 0.46 to 0.53 between the estimate of variance explained by individual chromosomes and their physical length. These results suggest that polygenicity is ubiquitous for Holstein milk production traits and will expand our knowledge of recent animal breeding approaches, such as genomic selection in Holstein.
Animals; Breeding*; Genetic Variation; Humans; Milk; Pedigree; Polymorphism, Single Nucleotide*
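The correlation reported in entry 9, between per-chromosome variance explained and chromosome length, can be computed as a Pearson coefficient. The sketch below assumes Pearson's r (the abstract does not name the coefficient) and uses made-up lengths and variance estimates purely to show the calculation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up values: chromosome lengths (Mb) and variance explained per chromosome.
lengths_mb = [158, 137, 121, 120, 105]
variance_explained = [0.040, 0.020, 0.030, 0.015, 0.018]
r = pearson_r(lengths_mb, variance_explained)
```

A clearly positive r across chromosomes is what one expects under a polygenic architecture, where many small-effect variants are spread roughly evenly along the genome.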
10.Prediction of Genes Related to Positive Selection Using Whole-Genome Resequencing in Three Commercial Pig Breeds.
Hyoyoung KIM ; Kelsey CAETANO-ANOLLES ; Minseok SEO ; Young jun KWON ; Seoae CHO ; Kangseok SEO ; Heebal KIM
Genomics & Informatics 2015;13(4):137-145
Selective sweeps can cause genetic differentiation across populations, which allows for the identification of possible causative regions or genes underlying important traits. The pig has experienced a long history of allele frequency changes through artificial selection in the domestication process. We obtained an average of 329,482,871 sequence reads for 24 pigs from three pig breeds: Yorkshire (n = 5), Landrace (n = 13), and Duroc (n = 6). An average read depth of 11.7 was obtained using whole-genome resequencing on an Illumina HiSeq2000 platform. In this study, cross-population extended haplotype homozygosity and cross-population composite likelihood ratio tests were implemented to detect genes experiencing positive selection in the genome-wide resequencing data generated from three commercial pig breeds. In our results, 26, 7, and 14 genes from Yorkshire, Landrace, and Duroc, respectively, were detected by the two statistical tests. Significant evidence for positive selection was identified for the genes ST6GALNAC2 and EPHX1 in Yorkshire, PARK2 in Landrace, and BMP6, SLA-DQA1, and PRKG1 in Duroc. These genes are reportedly relevant to lactation, reproduction, meat quality, and growth traits. To understand how single nucleotide polymorphisms (SNPs) related to positive selection affect protein function, we analyzed the effect of non-synonymous SNPs. Three SNPs (rs324509622, rs80931851, and rs80937718) in the SLA-DQA1 gene were significant in the enrichment tests, indicating strong evidence for positive selection in Duroc. Our analyses identified genes under positive selection for lactation, reproduction, and meat-quality and growth traits in Yorkshire, Landrace, and Duroc, respectively.
Female; Gene Frequency; Haplotypes; Lactation; Meat; Polymorphism, Single Nucleotide; Reproduction; Swine; Natural Resources