1. Effect of Normalization on Detection of Differentially-Expressed Genes with Moderate Effects.
Seoae CHO ; Eunjee LEE ; Youngchul KIM ; Taesung PARK
Genomics & Informatics 2007;5(3):118-123
The existing literature offers little guidance on how to decide which method to use to analyze one-channel microarray measurements when dealing with large, grouped samples. Most previous methods have focused on two-channel data; therefore, they cannot be easily applied to one-channel microarray data. Thus, a more reliable method is required to determine an appropriate combination of individual basic processing steps for a given dataset in order to improve the validity of one-channel expression data analysis. We address key issues in evaluating the effectiveness of basic statistical processing steps of microarray data that can affect the final outcome of gene expression analysis, without focusing on the intrinsic data underlying biological interpretation.
Analysis of Variance; Dataset; Gene Expression; Statistics as Topic
2. Joint Identification of Multiple Genetic Variants of Obesity in a Korean Genome-wide Association Study.
Sohee OH ; Seoae CHO ; Taesung PARK
Genomics & Informatics 2010;8(3):142-149
In recent years, genome-wide association (GWA) studies have successfully led to many discoveries of genetic variants affecting common complex traits, including height, blood pressure, and diabetes. Although GWA studies have made much progress in finding single nucleotide polymorphisms (SNPs) associated with many complex traits, such SNPs have been shown to explain only a very small proportion of the underlying genetic variance of complex traits. This is partly due to the fact that most current GWA studies have relied on single-marker approaches that identify genetic factors one at a time and are limited in their ability to consider the joint effects of multiple genetic factors on complex traits. Joint identification of multiple genetic factors would be more powerful and provide a better prediction of complex traits, since it utilizes combined information across variants. Recently, a new statistical method for joint identification of genetic variants for common complex traits via the elastic-net regularization method was proposed. In this study, we applied this joint identification approach to a large-scale GWA dataset (i.e., 8,842 samples and 327,872 SNPs) in order to identify genetic variants of obesity for the Korean population. In addition, in order to test for the biological significance of the jointly identified SNPs, gene ontology and pathway enrichment analyses were further conducted.
Blood Pressure; Genome-Wide Association Study; Joints; Obesity; Polymorphism, Single Nucleotide
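The idea of joint identification via elastic-net regularization can be illustrated with a minimal sketch. It is not the method or software used in the study: the simulated genotypes, the causal SNP indices (3 and 17), the effect sizes, and the values of lam and alpha are all assumptions chosen for illustration. The sketch fits all SNPs at once by coordinate descent on the objective (1/2n)||y - Xb||^2 + lam*(alpha*||b||_1 + ((1 - alpha)/2)*||b||^2) and reports the SNPs with non-zero coefficients.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator arising from the L1 part of the penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def center(v):
    """Subtract the mean so that no intercept term is needed."""
    m = sum(v) / len(v)
    return [x - m for x in v]


def elastic_net(X, y, lam, alpha, n_iter=100):
    """Cyclic coordinate descent for the elastic-net objective above."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta starting at zero
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of SNP j with the partial residual (SNP j added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * alpha) / (col_ss[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Hypothetical toy data: 200 samples, 30 SNPs coded 0/1/2, two causal SNPs.
random.seed(0)
n, p, maf = 200, 30, 0.3
geno = [[(random.random() < maf) + (random.random() < maf) for _ in range(p)]
        for _ in range(n)]
y = center([1.0 * g[3] - 0.8 * g[17] + random.gauss(0, 0.5) for g in geno])
cols = [center([g[j] for g in geno]) for j in range(p)]
X = [[cols[j][i] for j in range(p)] for i in range(n)]

beta = elastic_net(X, y, lam=0.2, alpha=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected SNP indices:", selected)
```

In practice lam and alpha would be tuned (for example by cross-validation) rather than fixed, and a genome-wide dataset would call for an optimized library implementation rather than pure-Python loops.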
3. Prediction of Genes Related to Positive Selection Using Whole-Genome Resequencing in Three Commercial Pig Breeds.
Hyoyoung KIM ; Kelsey CAETANO-ANOLLES ; Minseok SEO ; Young jun KWON ; Seoae CHO ; Kangseok SEO ; Heebal KIM
Genomics & Informatics 2015;13(4):137-145
Selective sweeps can cause genetic differentiation across populations, which allows for the identification of possible causative regions/genes underlying important traits. The pig has experienced a long history of allele frequency changes through artificial selection in the domestication process. We obtained an average of 329,482,871 sequence reads for 24 pigs from three pig breeds: Yorkshire (n = 5), Landrace (n = 13), and Duroc (n = 6). An average read depth of 11.7 was obtained using whole-genome resequencing on an Illumina HiSeq2000 platform. In this study, cross-population extended haplotype homozygosity and cross-population composite likelihood ratio tests were implemented to detect genes experiencing positive selection in the genome-wide resequencing data generated from three commercial pig breeds. In our results, 26, 7, and 14 genes from Yorkshire, Landrace, and Duroc, respectively, were detected by the two kinds of statistical tests. Significant evidence for positive selection was identified on genes ST6GALNAC2 and EPHX1 in Yorkshire, PARK2 in Landrace, and BMP6, SLA-DQA1, and PRKG1 in Duroc. These genes are reportedly relevant to lactation, reproduction, meat quality, and growth traits. To understand how these single nucleotide polymorphisms (SNPs) related to positive selection affect protein function, we analyzed the effect of non-synonymous SNPs. Three SNPs (rs324509622, rs80931851, and rs80937718) in the SLA-DQA1 gene were significant in the enrichment tests, indicating strong evidence for positive selection in Duroc. Our analyses identified genes under positive selection for lactation, reproduction, and meat-quality and growth traits in Yorkshire, Landrace, and Duroc, respectively.
Female; Gene Frequency; Haplotypes; Lactation; Meat; Polymorphism, Single Nucleotide; Reproduction; Swine; Natural Resources
4. The Usage of an SNP-SNP Relationship Matrix for Best Linear Unbiased Prediction (BLUP) Analysis Using a Community-Based Cohort Study.
Young Sup LEE ; Hyeon Jeong KIM ; Seoae CHO ; Heebal KIM
Genomics & Informatics 2014;12(4):254-260
Best linear unbiased prediction (BLUP) has been used to estimate the fixed effects and random effects of complex traits. Traditionally, genomic relationship matrix-based (GRM) and random marker-based BLUP analyses have been commonly used to estimate the genetic values of complex traits. We used three methods: GRM-based prediction (G-BLUP), random marker-based prediction using an identity matrix (so-called single-nucleotide polymorphism [SNP]-BLUP), and prediction using an SNP-SNP variance-covariance matrix (so-called SNP-GBLUP). We used 35,675 SNPs and the R package "rrBLUP" for the BLUP analysis. The SNP-SNP relationship matrix was calculated using the GRM and the Sherman-Morrison-Woodbury lemma. The SNP-GBLUP result was very similar to that of G-BLUP in the prediction of genetic values. However, there were many discrepancies between SNP-BLUP and the other two BLUPs. SNP-GBLUP has the advantage of being able to predict genetic values through SNP effects.
Cohort Studies*; Polymorphism, Single Nucleotide
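The role of a Woodbury-type matrix identity in moving between individual-level and SNP-level predictions can be sketched on toy data. This is a simplified illustration, not the paper's SNP-GBLUP procedure or the rrBLUP API: it assumes no fixed effects, an unscaled relationship matrix G = ZZ', a single shrinkage parameter lam fixed in advance rather than estimated, and randomly generated genotypes and phenotypes. Under those assumptions, the identity Z'(ZZ' + lam*I)^(-1) = (Z'Z + lam*I)^(-1)Z' means that the ridge-type SNP solution and the relationship-matrix solution give the same genetic values, and SNP effects can be back-solved from the relationship-matrix fit.

```python
import random


def transpose(A):
    return [list(r) for r in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def add_ridge(A, lam):
    """Return A + lam * I for a square matrix A."""
    k = len(A)
    return [[A[i][j] + (lam if i == j else 0.0) for j in range(k)] for i in range(k)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    k = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (M[i][k] - sum(M[i][j] * x[j] for j in range(i + 1, k))) / M[i][i]
    return x


# Hypothetical toy data: 8 individuals, 20 SNPs coded 0/1/2, fixed lam.
random.seed(1)
n, m, lam = 8, 20, 5.0
raw = [[random.choice([0, 1, 2]) for _ in range(m)] for _ in range(n)]
means = [sum(r[j] for r in raw) / n for j in range(m)]
Z = [[r[j] - means[j] for j in range(m)] for r in raw]  # column-centered genotypes
y = [random.gauss(0, 1) for _ in range(n)]
Zt = transpose(Z)

# SNP-level (ridge) solution: u = (Z'Z + lam I)^-1 Z'y, genetic values g = Z u.
u_snp = solve(add_ridge(matmul(Zt, Z), lam), matvec(Zt, y))
g_snp = matvec(Z, u_snp)

# Relationship-matrix solution: g = G (G + lam I)^-1 y with G = ZZ'.
G = matmul(Z, Zt)
a = solve(add_ridge(G, lam), y)
g_gblup = matvec(G, a)

# SNP effects back-solved from the relationship-matrix fit: u = Z'(G + lam I)^-1 y.
u_back = matvec(Zt, a)

print("max |g_snp - g_gblup| =", max(abs(p - q) for p, q in zip(g_snp, g_gblup)))
print("max |u_snp - u_back|  =", max(abs(p - q) for p, q in zip(u_snp, u_back)))
```

The relationship-matrix route only requires solving an n-by-n system, which is the practical appeal when the number of SNPs far exceeds the number of individuals.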