A data-mining approach to biomarker identification from protein profiles using discrete stationary wavelet transform.
- Author: Hussain MONTAZERY-KORDY; Mohammad Hossein MIRAN-BAYGI; Mohammad Hassan MORADI
- Publication Type: Journal Article
- MeSH: Biomarkers, Tumor/blood; Breast Neoplasms/blood; Computational Biology/methods; Female; Humans; Neoplasm Proteins/blood; Ovarian Neoplasms/blood; Proteomics/methods; Reproducibility of Results; Sensitivity and Specificity; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods
- From: Journal of Zhejiang University. Science. B 2008;9(11):863-870
- Country: China
- Language: English
- Abstract:
OBJECTIVE: To develop a new bioinformatic tool, based on a data-mining approach, for extracting the most informative proteins that could serve as potential biomarkers for the detection of cancer.
METHODS: Two independent datasets from serum samples of 253 ovarian cancer and 167 breast cancer patients were used. The samples were examined by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS). Informative proteins were extracted from the datasets using a data-mining method in the discrete stationary wavelet transform domain. For dimensionality reduction, hard thresholding was applied to reduce the number of wavelet coefficients, and a distance measure was then used to select the most discriminative coefficients. To find potential biomarkers from the selected wavelet coefficients, the inverse discrete stationary wavelet transform was applied in combination with a two-sided t-test.
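The steps described in METHODS can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the wavelet, the decomposition depth, the threshold value, or the distance measure. The sketch therefore assumes a single-level Haar wavelet with circular wrap-around, a Fisher-like separation score as the distance measure, and a Welch t statistic in place of a full two-sided test with p-values. All function names and parameters (`haar_swt`, `thr`, `top_k`, and so on) are hypothetical.

```python
import math
from statistics import mean, stdev

SQRT2 = math.sqrt(2.0)


def haar_swt(x):
    """One-level stationary (undecimated) Haar transform, circular wrap."""
    n = len(x)
    approx = [(x[i] + x[(i + 1) % n]) / SQRT2 for i in range(n)]
    detail = [(x[i] - x[(i + 1) % n]) / SQRT2 for i in range(n)]
    return approx, detail


def haar_iswt(approx, detail):
    """Invert haar_swt by averaging the two redundant reconstructions."""
    n = len(approx)
    return [
        0.5 * ((approx[i] + detail[i]) + (approx[i - 1] - detail[i - 1])) / SQRT2
        for i in range(n)
    ]


def hard_threshold(coeffs, thr):
    """Hard thresholding: zero every coefficient with magnitude below thr."""
    return [c if abs(c) >= thr else 0.0 for c in coeffs]


def separation_score(group_a, group_b):
    """Assumed distance measure: |mean difference| / (sum of std devs)."""
    spread = stdev(group_a) + stdev(group_b)
    return abs(mean(group_a) - mean(group_b)) / spread if spread > 0 else 0.0


def welch_t(group_a, group_b):
    """Welch t statistic; returns 0.0 when both groups have zero variance."""
    se = math.sqrt(
        stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / se if se > 0 else 0.0


def t_stats_after_selection(cancer, control, thr, top_k):
    """Threshold, keep the top_k most separating detail coefficients,
    reconstruct from those alone, and return |t| per spectrum position."""
    n = len(cancer[0])

    def details(spectra):
        return [hard_threshold(haar_swt(s)[1], thr) for s in spectra]

    d_cancer, d_control = details(cancer), details(control)
    scores = [
        separation_score([d[i] for d in d_cancer], [d[i] for d in d_control])
        for i in range(n)
    ]
    keep = set(sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k])

    def rebuild(detail):
        # Assumption: approximation coefficients are zeroed, so only the
        # selected detail coefficients contribute to the reconstruction.
        masked = [c if i in keep else 0.0 for i, c in enumerate(detail)]
        return haar_iswt([0.0] * n, masked)

    r_cancer = [rebuild(d) for d in d_cancer]
    r_control = [rebuild(d) for d in d_control]
    return [
        abs(welch_t([r[i] for r in r_cancer], [r[i] for r in r_control]))
        for i in range(n)
    ]
```

Positions with a large |t| in the reconstructed signal would be the candidate m/z locations to inspect. With a single-level Haar wavelet, each retained detail coefficient spreads over two adjacent positions, so a deeper decomposition or a smoother wavelet would likely localize peaks differently.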
RESULTS: From the ovarian cancer dataset, a set of five proteins was detected as potential biomarkers that could distinguish the cancer patients from the healthy cases with accuracy, sensitivity, and specificity of 100%. From the breast cancer dataset, a set of eight proteins was found as potential biomarkers that could separate the healthy cases from the cancer patients with an accuracy of 98.26%, a sensitivity of 100%, and a specificity of 95.6%.
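For reference, the three figures reported in RESULTS follow the standard confusion-matrix definitions, sketched below. The function name `diagnostic_metrics` and the counts in the usage line are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion counts.

    tp/fn: cancer cases classified correctly/incorrectly.
    tn/fp: healthy cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)          # fraction of cancer cases detected
    specificity = tn / (tn + fp)          # fraction of healthy cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only.
acc, sens, spec = diagnostic_metrics(tp=50, fn=0, tn=45, fp=5)
```

A sensitivity of 100% alongside a lower specificity, as in the breast cancer result, corresponds to no missed cancer cases and a small number of healthy cases flagged as cancer.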
CONCLUSION: The results show that the new bioinformatic tool can be used in combination with high-throughput proteomic data such as SELDI-TOF MS to find potential biomarkers with high discriminative power.