Chinese Journal of Analytical Chemistry 2015;(7):1086-1091

doi:10.11895/j.issn.0253-3820.150205

Ensemble Partial Least Squares Algorithm Based on Variable Clustering for Quantitative Infrared Spectrometric Analysis

Yiming BI ; Guohai CHU ; Jizhong WU ; Kailong YUAN ; Jian WU ; Fu LIAO ; Jun XIA ; Guangxin ZHANG ; Guojun ZHOU

Keywords

Chemometrics; Partial least squares; Quantitative analysis; Spectrometric analysis; Model ensemble

Country

China

Language

Chinese

Abstract

Because it can overcome both the high dimensionality and the collinearity of spectral data, partial least squares (PLS) is increasingly used for quantitative spectrometric analysis, especially for near-infrared, mid-infrared and Raman spectra. In this work, an improved PLS algorithm is proposed for efficient information extraction and noise reduction. The spectral variables are clustered into several subsets, and sub-models are built on these subsets. The sub-models are then re-weighted and combined into the final ensemble model. Experiments on two near-infrared datasets (octane number prediction in gasoline and nicotine prediction in tobacco leaves) demonstrate that the new method outperforms the conventional PLS algorithm: the root mean square error of prediction (RMSEP) is reduced by 32% and 22%, respectively.
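The cluster, fit and re-weight workflow described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not state the clustering method, the weighting rule or the number of latent variables. As stand-ins, this sketch assumes contiguous wavelength blocks in place of the variable clustering, weights proportional to the inverse validation error, and a standard single-response PLS (PLS1, NIPALS) for each sub-model. All function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_pls1(X, y, n_components):
    """Fit single-response PLS by NIPALS; returns (x_mean, y_mean, components)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_components):
        w = [dot([row[j] for row in E], f) for j in range(p)]  # weight ~ X'y
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:   # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # scores
        tt = dot(t, t)
        if tt < 1e-12:
            break
        load = [dot([row[j] for row in E], t) / tt for j in range(p)]
        q = dot(f, t) / tt
        # Deflate X and y before extracting the next component.
        E = [[row[j] - ti * load[j] for j in range(p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def predict_pls1(model, x):
    """Predict one sample by replaying the deflation steps."""
    x_mean, y_mean, comps = model
    e = [xi - m for xi, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = dot(e, w)
        y_hat += q * t
        e = [ei - t * lj for ei, lj in zip(e, load)]
    return y_hat

def rmse(y_true, y_pred):
    """Root mean square error; on an independent test set this is the RMSEP."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def split_blocks(n_vars, n_blocks):
    """ASSUMPTION: contiguous index blocks stand in for variable clustering.
    Returns at most n_blocks blocks."""
    size = math.ceil(n_vars / n_blocks)
    return [list(range(s, min(s + size, n_vars))) for s in range(0, n_vars, size)]

def fit_ensemble(X_tr, y_tr, X_val, y_val, n_blocks, n_components):
    """One PLS1 sub-model per variable subset, weighted by inverse validation RMSE
    (ASSUMPTION: the abstract does not give the weighting rule)."""
    blocks = split_blocks(len(X_tr[0]), n_blocks)
    models, inv_err = [], []
    for idx in blocks:
        sub_train = [[row[j] for j in idx] for row in X_tr]
        model = fit_pls1(sub_train, y_tr, n_components)
        pred = [predict_pls1(model, [row[j] for j in idx]) for row in X_val]
        models.append(model)
        inv_err.append(1.0 / (rmse(y_val, pred) + 1e-12))
    total = sum(inv_err)
    weights = [v / total for v in inv_err]  # normalized to sum to 1
    return blocks, models, weights

def predict_ensemble(ensemble, x):
    """Weighted sum of the sub-model predictions for one spectrum x."""
    blocks, models, weights = ensemble
    return sum(w * predict_pls1(m, [x[j] for j in idx])
               for idx, m, w in zip(blocks, models, weights))
```

To compare against conventional PLS in the spirit of the reported experiments, one would fit `fit_pls1` on all variables and `fit_ensemble` on the same training data, then compute `rmse` for both on a held-out prediction set. The relative RMSEP reduction is (RMSEP_PLS - RMSEP_ensemble) / RMSEP_PLS.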