Significant Gene Selection Using Integrated Microarray Data Set with Batch Effect.
- Author:
Ki Yeol KIM(1); Hyun Cheol CHUNG; Hei Cheul JEUNG; Ji Hye SHIN; Tae Soo KIM; Sun Young RHA
Author Information
1. Oral Cancer Research Institute, Yonsei University College of Dentistry, Seoul, Korea.
- Publication Type:Original Article
- Keywords:
genomic data;
integration;
batch effect;
bioinformatics
- MeSH:
Bias (Epidemiology);
Computational Biology;
Dataset*;
Genes, vif;
RNA
- From:Genomics & Informatics
2006;4(3):110-117
- Country:Republic of Korea
- Language:English
- Abstract:
In microarray technology, many diverse experimental features can cause biases, including RNA sources, microarray production or different platforms, diverse sample processing, and various experimental protocols. These systematic effects are a substantial obstacle in the analysis of microarray data. When data sets derived from different experimental processes are analyzed together, the results are largely inconsistent and unreliable. Therefore, one of the most pressing challenges in the microarray field is how to combine data that come from two different groups. As a novel attempt to integrate two data sets with a batch effect, we simply applied standardization to the microarray data before significant gene selection. In the gene selection step, we used a newly defined measure that considers the distance between a gene and an ideal gene as well as the between-slide and within-slide variations. We also discussed the association between biological functions and different expression patterns in the selected discriminative gene set. As a result, we could confirm that the batch effect was minimized by standardization and that the genes selected from the standardized data included various expression patterns and significant biological functions.
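The standardization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact procedure, so this example assumes that standardization means z-scoring each gene separately within each batch; the function name `standardize_within_batch` and the toy data are hypothetical and are not taken from the article.

```python
from statistics import mean, stdev

def standardize_within_batch(expr, batch):
    """Z-score each gene (row) separately within each batch of samples (columns).

    Assumption: per-gene, per-batch z-scoring; the article's exact
    standardization procedure is not given in the abstract.
    """
    out = [[0.0] * len(row) for row in expr]
    for b in sorted(set(batch)):
        # Column indices of the samples that belong to batch b.
        cols = [j for j, label in enumerate(batch) if label == b]
        for i, row in enumerate(expr):
            values = [row[j] for j in cols]
            m = mean(values)
            s = stdev(values) if len(values) > 1 else 0.0
            for j in cols:
                # Genes that are constant within a batch are set to 0
                # instead of dividing by zero.
                out[i][j] = (row[j] - m) / s if s > 0 else 0.0
    return out

# Toy data (hypothetical): 2 genes x 6 samples, where batch "B" is shifted upward.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [0.5, 0.7, 0.9, 3.5, 3.7, 3.9],
]
batch = ["A", "A", "A", "B", "B", "B"]
standardized = standardize_within_batch(expr, batch)
```

After this transformation each gene has mean 0 and unit standard deviation within every batch, so an additive or multiplicative shift between the two batches no longer dominates a downstream gene selection step. It also removes any genuine difference in average expression between batches, which is one reason this should be read as an illustration and not as the authors' method.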