1.Comparison of Normalization Methods for Defining Copy Number Variation Using Whole-genome SNP Genotyping Data.
Ji Hong KIM ; Seon Hee YIM ; Yong Bok JEONG ; Seong Hyun JUNG ; Hai Dong XU ; Seung Hun SHIN ; Yeun Jun CHUNG
Genomics & Informatics 2008;6(4):231-234
Precise and reliable identification of copy number variation (CNV) remains important for fully understanding the effect of CNV on genetic diversity and the background of complex diseases. SNP markers have frequently been used to detect CNVs, but the analysis of SNP chip data for identifying CNVs has not been well established. We compared various normalization methods for CNV analysis and suggest an optimal normalization procedure for reliable CNV calling. Samples from four normal Koreans and the NA10851 HapMap male were genotyped using the Affymetrix Genome-Wide Human SNP Array 5.0. We evaluated the effect of median and quantile normalization to find the optimal normalization for CNV detection based on SNP array data. We also explored the effect of Robust Multichip Average (RMA) background correction for each normalization process. In total, the following 4 combinations of normalization were tried: 1) median normalization without RMA background correction, 2) quantile normalization without RMA background correction, 3) median normalization with RMA background correction, and 4) quantile normalization with RMA background correction. CNVs were called using the SW-ARRAY algorithm. We compared the effect of the 4 combinations using intensity ratio profiles, box plots, and MA plots. When median and quantile normalization were applied without RMA background correction, both methods showed a similar normalization effect, and the final CNV calls were also similar in number and size. With both median and quantile normalization, RMA background correction widened the range of the intensity ratio distribution, which suggests that RMA background correction may help to detect more CNVs than no correction.
Coat Protein Complex I
;
Genetic Variation
;
HapMap Project
;
Humans
;
Male
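The two normalization schemes compared in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names are hypothetical, each sample is assumed to be a plain list of probe intensities of equal length, ties are ignored in the quantile step, and RMA background correction is not shown.

```python
from statistics import median


def median_normalize(samples):
    """Scale each sample so its median equals the mean of all sample medians."""
    medians = [median(s) for s in samples]
    target = sum(medians) / len(medians)
    return [[x * target / m for x in s] for s, m in zip(samples, medians)]


def quantile_normalize(samples):
    """Give every sample the same distribution: the mean of the sorted values at each rank.

    Assumes all samples have the same length; tied values are not averaged.
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]
    result = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # probe indices from lowest to highest
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = rank_means[rank]
        result.append(normalized)
    return result


# Made-up intensities for two arrays, three probes each.
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
by_median = median_normalize(arrays)
by_quantile = quantile_normalize(arrays)
```

Median normalization only shifts the centre of each array, whereas quantile normalization forces identical distributions, which is why the two can affect the spread of intensity ratios differently.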
2.Computational Approach for Biosynthetic Engineering of Post-PKS Tailoring Enzymes.
Genomics & Informatics 2008;6(4):227-230
Compounds of polyketide origin possess a wealth of pharmacological effects, including antibacterial, antifungal, antiparasitic, anticancer and immunosuppressive activities. Many of these compounds and their semisynthetic derivatives are used today in the clinic. Most of the gene clusters encoding commercially important drugs have also been cloned and sequenced and their biosynthetic mechanisms studied in great detail. The area of biosynthetic engineering of the enzymes involved in polyketide biosynthesis has recently advanced and been transferred into the industrial arena. In this work, we introduce a computational system to provide the user with a wealth of information that can be utilized for biosynthetic engineering of enzymes involved in post-PKS tailoring steps. Post-PKS tailoring steps are necessary to add functional groups essential for the biological activity and are therefore important in polyketide biosynthesis.
Clone Cells
;
Multigene Family
3.Computational Approach for the Analysis of Post-PKS Glycosylation Step.
Genomics & Informatics 2008;6(4):223-226
We introduce a computational approach for the analysis of glycosylation in post-PKS tailoring steps. The method predicts the deoxysugar biosynthesis unit pathway and the substrate specificity of glycosyltransferases involved in the glycosylation of polyketides. A directed, weighted graph is introduced to represent and predict the deoxysugar biosynthesis unit pathway. In addition, a homology-based gene clustering method is used to predict the substrate specificity of glycosyltransferases. The approach is useful for the rational design of polyketide natural products, which leads to in silico drug discovery.
Biological Agents
;
Computer Simulation
;
Glycosylation
;
Glycosyltransferases
;
Polyketides
;
Substrate Specificity
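The abstract above does not say how the directed, weighted graph is searched or how its weights are assigned. As one possible reading, the sketch below assumes that a lower edge weight marks a more plausible biosynthetic step and uses Dijkstra's algorithm to pick the cheapest route between two intermediates. The node labels and weights are placeholders, not real deoxysugar intermediates.

```python
import heapq


def cheapest_path(graph, start, goal):
    """Dijkstra's algorithm on a dict-of-dicts directed graph with non-negative weights.

    Returns (total_weight, [nodes on the path]) or None if goal is unreachable.
    """
    queue = [(0.0, start, [start])]
    settled = {}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled and settled[node] <= cost:
            continue  # already reached this node at least as cheaply
        settled[node] = cost
        for successor, weight in graph.get(node, {}).items():
            heapq.heappush(queue, (cost + weight, successor, path + [successor]))
    return None


# Hypothetical pathway graph: keys are intermediates, values map successors to weights.
pathway = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"C": 1.0, "D": 5.0},
    "C": {"D": 1.0},
    "D": {},
}
route = cheapest_path(pathway, "A", "D")
```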
4.Unfolded Histidine-Tagged Protein is Immobilized to Nitrilotriacetic Acid-Nickel Beads, But Not the Nickel-Coated Glass Slide.
Minho CHO ; Sunyoung AHN ; Heonyong PARK
Genomics & Informatics 2006;4(3):133-136
The adsorption of proteins on the surface of glass slides is essential for the construction of protein chips. Previously, we prepared a nickel-coated plate by the spin-coating method for immobilization of His-tagged proteins. To determine whether a structural factor is responsible for the immobilization of His-tagged proteins on the nickel-coated glass slide, we carried out a series of experiments. First, we purified a His-tagged protein after expressing the vector in E. coli BL21 (DE3). We then obtained the unfolding curve for the His-tagged protein using guanidine hydrochloride. The unfolded fraction was monitored by intrinsic fluorescence spectroscopy. The ΔG(H2O) for unfolding was 2.27 ± 0.52 kcal/mol. We then tested whether unfolded His-tagged proteins can be adsorbed to the nickel-coated plate, in comparison with Ni2+-NTA (nitrilotriacetic acid) beads. Whereas unfolded His-tagged proteins were adsorbed to Ni2+-NTA beads, they did not bind to the nickel-coated plate. In conclusion, protein structure is likely to be an important factor in constructing protein chips when His-tagged proteins are immobilized on nickel-coated slides.
Adsorption
;
Fibrinogen
;
Glass*
;
Guanidine
;
Immobilization
;
Protein Array Analysis
;
Spectrometry, Fluorescence
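The abstract above reports a ΔG(H2O) from a guanidine hydrochloride unfolding curve but does not state how it was derived. A common approach, assumed here, is the two-state linear extrapolation model: ΔG = -RT ln(f_U / (1 - f_U)) is computed in the transition region and fitted to ΔG = ΔG(H2O) - m[denaturant]. The sketch below uses synthetic data and hypothetical function names; it does not reproduce the paper's measurements.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(conc, dg_h2o, m_value, temp_k=298.15):
    """Two-state model: unfolded fraction at a given denaturant concentration (M)."""
    dg = dg_h2o - m_value * conc
    k = math.exp(-dg / (R_KCAL * temp_k))
    return k / (1.0 + k)


def extrapolate_to_water(concs, fractions, temp_k=298.15):
    """Least-squares fit of dG versus concentration; returns (dG(H2O), m-value)."""
    dgs = [-R_KCAL * temp_k * math.log(f / (1.0 - f)) for f in fractions]
    n = len(concs)
    mean_x = sum(concs) / n
    mean_y = sum(dgs) / n
    sxx = sum((x - mean_x) ** 2 for x in concs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concs, dgs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -slope


# Synthetic transition-region data generated from assumed parameters (not measured values).
concs = [1.0, 1.5, 2.0, 2.5, 3.0]
fractions = [fraction_unfolded(c, dg_h2o=2.0, m_value=1.0) for c in concs]
dg_water, m_fit = extrapolate_to_water(concs, fractions)
```

Only points where the unfolded fraction is strictly between 0 and 1 can be used, because the logarithm is undefined at the plateaus.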
5.GraPT: Genomic InteRpreter about Predictive Toxicology.
Jung Hoon WOO ; Yu Rang PARK ; Yong JUNG ; Ji Hun KIM ; Ju Han KIM
Genomics & Informatics 2006;4(3):129-132
Toxicogenomics has recently emerged in the field of toxicology, and the DNA microarray technique has become a common strategy in predictive toxicology, which studies the molecular mechanisms triggered by exposure to chemical or environmental stress. Although microarray experiments offer researchers extensive genomic information, the high dimensionality of the data often makes it hard to extract meaningful results. We therefore developed a toxicant enrichment analysis similar to the common enrichment approach. We also developed a web-based system, GraPT, to enable prediction of the toxic endpoints of an experimental chemical.
Oligonucleotide Array Sequence Analysis
;
Toxicogenetics
;
Toxicology*
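The abstract above does not specify the statistic behind its toxicant enrichment analysis. A common form of enrichment analysis is the over-representation test, so the sketch below assumes a one-sided hypergeometric test: given a gene universe, a set of genes annotated to one toxicant, and a list of selected genes, it computes the probability of seeing at least the observed overlap by chance. The function name and the counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(population, annotated, selected).

    population: number of genes on the array
    annotated:  genes associated with the toxicant (hypothetical annotation set)
    selected:   genes in the differentially expressed list
    overlap:    selected genes that are also annotated
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 10 genes in total, 4 annotated, 3 selected, 2 in common.
p = enrichment_p_value(population=10, annotated=4, selected=3, overlap=2)
```

In practice the p-values for many toxicants would need a multiple-testing correction, which is not shown here.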
6.BioCC: An Openfree Hypertext Bio Community Cluster for Biology.
Sungsam GONG ; Tae Hyung KIM ; Jungsu OH ; Jekeun KWON ; Su An CHO ; Dan BOLSER ; Jong BHAK
Genomics & Informatics 2006;4(3):125-128
We present an openfree hypertext (also known as wiki) web cluster called BioCC. BioCC is a novel wiki farm that lets researchers create hundreds of biological web sites. The web sites form an organic information network. The contents of all the sites on the BioCC wiki farm are modifiable by anonymous as well as registered users. This enables biologists with diverse backgrounds to form their own Internet bio-communities. Each community can have custom-made layouts for information, discussion, and knowledge exchange. BioCC aims to form an ever-expanding network of openfree biological knowledge databases used and maintained by biological experts, students, and general users. The philosophy behind BioCC is that the formation of biological knowledge is best achieved by open-minded individuals freely exchanging information. In the near future, society will be flooded with genomic information. BioCC can be an effective and quickly updated knowledge database system. BioCC uses an open-source wiki system called Mediawiki. However, for easier editing, a modified version of Mediawiki, called Biowiki, has been applied. Unlike Mediawiki, Biowiki uses a WYSIWYG (What You See Is What You Get) text editor. BioCC is under a share-alike license called BioLicense (http://biolicense.org). The BioCC top level site is found at http://bio.cc/
Anonyms and Pseudonyms
;
Biology*
;
Computational Biology
;
Humans
;
Hypermedia*
;
Information Services
;
Internet
;
Licensure
;
Linear Energy Transfer
;
Philosophy
7.A Metabolic Pathway Drawing Algorithm for Reducing the Number of Edge Crossings.
Eun Ha SONG ; Min Kyung KIM ; Sang Ho LEE
Genomics & Informatics 2006;4(3):118-124
To convey flow directly, pathway data are usually represented as directed graphs in biological journals and texts. Databases of metabolic pathways or signal transduction pathways inevitably contain such graphs to show the flow. KEGG, one of the representative pathway databases, uses manually drawn figures, which cannot be easily maintained. Some databases, such as EcoCyc, apply graph layout algorithms to visualize metabolic pathways. Although these algorithms can reflect data changes in real time, the number of edge crossings increases exponentially as the number of nodes increases. To understand the genome-scale flow of metabolism, it is very important to reduce the unnecessary edge crossings that arise in automatic graph layout. We propose a metabolic pathway drawing algorithm that reduces the number of edge crossings by exploiting the fact that a metabolic pathway graph is a scale-free network. The experimental results show that the number of edge crossings is reduced by about 37-40% when the scale-free property is taken into account, compared with when it is not. We also found that an increase in the number of nodes does not always mean an increase in edge crossings.
Genome
;
Metabolic Networks and Pathways*
;
Metabolism
;
Signal Transduction
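The quantity the abstract above sets out to reduce, the number of edge crossings in a drawn graph, can be measured as in the sketch below. It assumes straight-line edges and counts pairs of edges that cross at an interior point. It is an evaluation helper with hypothetical names and coordinates, not the authors' layout algorithm.

```python
from itertools import combinations


def _orient(p, q, r):
    """Signed area test: > 0 if r is left of the line p->q, < 0 if right, 0 if collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True when segments ab and cd cross at a single point interior to both."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def count_edge_crossings(positions, edges):
    """Count crossing pairs among straight-line edges; edges sharing a node are skipped."""
    crossings = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        if len({u1, v1, u2, v2}) < 4:
            continue  # edges that meet at a common node do not count as a crossing
        if segments_cross(positions[u1], positions[v1], positions[u2], positions[v2]):
            crossings += 1
    return crossings


# Hypothetical layout: four nodes on the corners of a unit square.
positions = {"a": (0, 0), "b": (1, 1), "c": (0, 1), "d": (1, 0)}
diagonals = count_edge_crossings(positions, [("a", "b"), ("c", "d")])
sides = count_edge_crossings(positions, [("a", "c"), ("b", "d")])
```

The pairwise check is quadratic in the number of edges, which is adequate for comparing two layouts of the same pathway.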
8.Significant Gene Selection Using Integrated Microarray Data Set with Batch Effect.
Ki Yeol KIM ; Hyun Cheol CHUNG ; Hei Cheul JEUNG ; Ji Hye SHIN ; Tae Soo KIM ; Sun Young RHA
Genomics & Informatics 2006;4(3):110-117
In microarray technology, many diverse experimental features can cause biases, including RNA sources, microarray production or different platforms, diverse sample processing, and various experimental protocols. These systematic effects are a substantial obstacle in the analysis of microarray data. When data sets derived from different experimental processes were used together, the analysis results were mostly inconsistent and unreliable. Therefore, one of the most pressing challenges in the microarray field is how to combine data that come from two different groups. As a novel attempt to integrate two data sets with a batch effect, we simply applied standardization to the microarray data before significant gene selection. In the gene selection step, we used a newly defined measure that considers the distance between a gene and an ideal gene as well as the between-slide and within-slide variations. We also discussed the association between biological functions and different expression patterns in the selected discriminative gene set. As a result, we could confirm that the batch effect was minimized by standardization and that the genes selected from the standardized data included various expression patterns and significant biological functions.
Bias (Epidemiology)
;
Computational Biology
;
Dataset*
;
Genes, vif
;
RNA
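The abstract above says only that standardization was applied before gene selection; it does not say along which axis. The sketch below assumes one plausible variant: each gene is converted to z-scores within its own batch, so that a constant offset between batches disappears before the data sets are combined. Gene names and values are invented for illustration.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores (mean 0, population standard deviation 1) for one gene in one batch."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant gene carries no information
    return [(v - mu) / sigma for v in values]


def standardize_batch(batch):
    """batch maps a gene name to its expression values across the arrays of that batch."""
    return {gene: standardize(values) for gene, values in batch.items()}


# Two hypothetical batches measuring the same gene with a constant offset of 10.
batch_a = {"gene1": [1.0, 2.0, 3.0]}
batch_b = {"gene1": [11.0, 12.0, 13.0]}
combined = standardize_batch(batch_a)["gene1"] + standardize_batch(batch_b)["gene1"]
```

After standardization the two batches have identical values for this gene, which illustrates how a purely additive batch effect is removed. Real batch effects may also change the variance or differ between genes.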
9.Association of the Human IL-28RA Gene Polymorphisms in a Korean Population with Asthma.
Soo Cheon CHAE ; Young Ran PARK ; Yong Chul LEE ; Yun Sik YANG ; Hun Taeg CHUNG
Genomics & Informatics 2006;4(3):103-109
IL-28RA is one of the important candidate genes for complex genetic diseases, but there are only a few published results for this gene. Previously, we identified eighteen SNPs and two variation sites in the entire coding regions of IL-28RA, including promoter regions, and suggested that the g.32349G > A polymorphism of IL-28RA might be associated with susceptibility to allergic rhinitis. In this study, we chose seven SNPs (g.-1193A > C, g.-30C > T, g.17654C > T, g.27798A > G, g.31265C > T, g.31911C > T and g.32349G > A) of IL-28RA and examined whether these polymorphisms were further associated with genetic predisposition to asthma. We compared the genotype and allele frequencies of the IL-28RA polymorphisms between asthma patients and healthy controls. We also compared the frequencies of the haplotypes constructed from these SNPs between asthma patients and controls. Our results suggest that the polymorphisms of the IL-28RA gene were not associated with susceptibility to asthma, nor with IgE production or eosinophil recruitment. The haplotype frequencies of these SNPs also did not differ significantly between healthy controls and asthma patients. These results indicate that the IL-28RA polymorphisms might not be associated with asthma susceptibility.
Asthma*
;
Clinical Coding
;
Eosinophils
;
Gene Frequency
;
Genetic Predisposition to Disease
;
Genotype
;
Haplotypes
;
Humans*
;
Immunoglobulin E
;
Polymorphism, Single Nucleotide
;
Promoter Regions, Genetic
;
Rhinitis
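The abstract above compares allele frequencies between patients and controls without naming the test used. The sketch below assumes a Pearson chi-square test on a 2x2 table of allele counts (1 degree of freedom, no continuity correction). The genotype counts are invented and the function names are hypothetical.

```python
def allele_counts(n_major_hom, n_het, n_minor_hom):
    """Convert genotype counts for a biallelic SNP into (major, minor) allele counts."""
    major = 2 * n_major_hom + n_het
    minor = 2 * n_minor_hom + n_het
    return major, minor


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the table sums to zero")
    return n * (a * d - b * c) ** 2 / denominator


# Invented genotype counts (major homozygote, heterozygote, minor homozygote).
case_major, case_minor = allele_counts(50, 40, 10)
control_major, control_minor = allele_counts(55, 35, 10)
statistic = chi_square_2x2(case_major, case_minor, control_major, control_minor)
```

A statistic below about 3.84 corresponds to p > 0.05 at 1 degree of freedom, which is the kind of result a "no association" finding would rest on.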
10.Identification of Gene Expression Signatures in Korean Acute Leukemia Patients.
Kyung Hun LEE ; Se Won PARK ; Inho KIM ; Sung Soo YOON ; Seonyang PARK ; Byoung Kook KIM
Genomics & Informatics 2006;4(3):97-102
BACKGROUND: In acute leukemia patients, several successful methods of expression profiling have been used for various purposes, i.e., to identify new disease classes, to select a therapeutic target, or to predict chemo-sensitivity and clinical outcome. In the present study, we tested the peripheral blood of 47 acute leukemia patients in an attempt to identify differentially expressed genes in AML and ALL using a Korean-made 10K oligonucleotide microarray. METHODS: Total RNA was prepared from peripheral blood and amplified for microarray experimentation. SAM (significance analysis of microarrays) and PAM (prediction analysis of microarrays) were used to select significant genes. The selected genes were then tested in a test group, independently of the training group. RESULTS: We identified 345 differentially expressed genes that differentiated AML and ALL patients (FWER < 0.05). Genes were selected using the training group (n=35) and tested in the test group (n=12). The selected genes discriminated AML and ALL patients accurately in both the training group and the test group. Genes that showed relatively high expression in AML patients were deoxynucleotidyl transferase, pre-B lymphocyte gene 3, B-cell linker, CD9 antigen, lymphoid enhancer-binding factor 1, CD79B antigen, and early B-cell factor. Genes highly expressed in ALL patients were annexin A1, amyloid beta (A4) precursor protein, amyloid beta (A4) precursor-like protein 2, cathepsin C, lysozyme (renal amyloidosis), myeloperoxidase, and hematopoietic prostaglandin D2 synthase. CONCLUSION: This study provided genome-wide molecular signatures of Korean acute leukemia patients, which clearly distinguish AML from ALL. Together with other reported signatures, these molecular signatures provide a means of achieving a molecular diagnosis in Korean acute leukemia patients.
Amyloid
;
Antigens, CD79
;
Antigens, CD9
;
B-Lymphocytes
;
Cathepsin C
;
Diagnosis
;
DNA Nucleotidylexotransferase
;
Gene Expression*
;
Genome
;
Humans
;
Leukemia*
;
Leukemia, Myeloid, Acute
;
Lymphoid Enhancer-Binding Factor 1
;
Muramidase
;
Peroxidase
;
Precursor Cell Lymphoblastic Leukemia-Lymphoma
;
Precursor Cells, B-Lymphoid
;
Prostaglandin D2
;
RNA
;
Transcriptome*