Multivariate statistical analysis for metabolomic data: the key points in principal component analysis
10.16438/j.0513-4870.2017-1288
- VernacularTitle:代谢组学数据处理——主成分分析十个要点问题
- Author:Ji-ye A (1); Jun HE (1); Run-bin SUN (1)
Author Information
1. Jiangsu Province Key Laboratory of Drug Metabolism and Pharmacokinetics, Laboratory of Metabolomics, Jiangsu Key Laboratory of Drug Design and Optimization, State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing 210009, China
- Publication Type:REVIEWS
- Keywords:metabolomics; principal component analysis; system biology; multivariate statistical analysis; principal component
- From:Acta Pharmaceutica Sinica 2018;53(6):929-937
- Country:China
- Language:Chinese
- Abstract:
Metabolomics data contain many variables and are usually processed and evaluated by principal component analysis. Multivariate statistical analysis involves abstract, elusive model fitting in high-dimensional space, complicated theoretical computation, and sophisticated transformation of the data matrix. It is therefore crucial to fully understand the computational mechanism and the properties of the models. In this article, we review the key and puzzling issues in principal component analysis of metabolomics data, including principal components, the scores and loadings of a principal component, scaling and weighting, partial least squares projection to latent structures (PLS), partial least squares discriminant analysis (PLS-DA), orthogonal projection to latent structures (OPLS), orthogonal bidirectional projections to latent structures (O2PLS), the S-plot, the shared and unique structure (SUS) plot, and model validation. We hope this article provides a better understanding of data processing modes, model selection, procedure standardization, and data interpretation, so that reliable conclusions can be drawn.
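
To make the terms scaling, scores and loadings concrete, the following is a minimal sketch and not the authors' procedure. It assumes a samples-by-variables matrix, computes the components with a NumPy singular value decomposition, and uses synthetic random data. The function name `pca_scores_loadings` and the scaling labels `"ctr"`, `"uv"` and `"pareto"` are illustrative choices, not taken from the article.

```python
import numpy as np


def pca_scores_loadings(X, n_components=2, scaling="uv"):
    """PCA of a samples x variables matrix via SVD (illustrative sketch).

    scaling: "ctr" (mean centering only), "uv" (unit variance scaling),
    or "pareto" (divide by the square root of the standard deviation).
    Returns (scores, loadings, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)            # mean-center every variable
    sd = Xc.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0                  # leave constant variables unscaled
    if scaling == "uv":
        Xc = Xc / sd
    elif scaling == "pareto":
        Xc = Xc / np.sqrt(sd)
    elif scaling != "ctr":
        raise ValueError("scaling must be 'ctr', 'uv' or 'pareto'")

    # Xc = U * diag(s) * Vt; rows of Vt are the principal directions
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates
    loadings = Vt[:n_components].T                    # variable weights
    explained = (s ** 2) / np.sum(s ** 2)             # variance fractions
    return scores, loadings, explained[:n_components]


# Synthetic example: 20 samples, 50 variables (assumed, not real data)
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
T, P, r2 = pca_scores_loadings(X, n_components=2, scaling="pareto")
```

Here each row of `T` places one sample in the score plot, each row of `P` gives the contribution of one variable to the two components, and `r2` is the fraction of the scaled variance captured by each component.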