Identification of breast cancer and its molecular sub-types via Raman spectroscopy combined with machine learning algorithms
10.3760/cma.j.cn121382-20240222-00303
- VernacularTitle:拉曼光谱结合机器学习算法对乳腺癌及其分子亚型的识别
- Author: Juan LI 1; Chao YANG; Jiayi TANG; Jingjing XIA; Haojun LIU; Ahmat ZULHUMAR; Xin’en CAI; Maimaitijiang AYITILA
- Author Information:
1. College of Life Science and Technology, Xinjiang University, Urumqi 830017
- Keywords:
Raman spectroscopy;
Breast cancer;
Molecular sub-type;
Machine learning
- From:
International Journal of Biomedical Engineering
2024;47(3):219-226
- Country:China
- Language:Chinese
- Abstract:
Objective: To develop a simple, rapid, and convenient analytical method for identifying breast cancer and its molecular sub-types.
Methods: A laser confocal Raman spectrometer was used to collect Raman spectra of normal breast cells and of breast cancer cells of different molecular sub-types, and the spectral peaks were assigned to their molecular origins. The spectra were first smoothed and denoised with a Savitzky-Golay filter (window size 9). Baseline correction was then performed with the adaptive iteratively reweighted penalized least squares (airPLS) method, and principal component analysis was used to remove outliers. Two recognition models, one distinguishing normal breast cells from breast cancer cells and one distinguishing the molecular sub-types of breast cancer cells, were built with three algorithms based on different principles: partial least squares discriminant analysis (PLS-DA), K-nearest neighbor (KNN), and support vector machine (SVM).
Results: The Raman spectra of normal breast cells and breast cancer cells were similar in overall profile and peak positions (Raman shifts) but differed significantly in peak intensity. The PLS-DA and SVM models distinguished normal breast cells from breast cancer cells with recognition accuracies above 92.03% and 90.67%, respectively. For the molecular sub-types of breast cancer cells, the recognition accuracies of PLS-DA and SVM were (83.66 ± 2.77)% and (90.55 ± 0.06)%, respectively.
Conclusions: Raman spectroscopy combined with machine learning algorithms can accurately identify normal breast cells, breast cancer cells, and the different molecular sub-types of breast cancer cells.
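The Savitzky-Golay smoothing step named in the Methods can be illustrated with a minimal sketch. The abstract gives only the window size (9), so the polynomial order (quadratic/cubic), the mirror padding at the spectrum edges, and the function name `savgol_smooth_9` are assumptions made for illustration and are not taken from the paper. The coefficients are the standard 9-point quadratic/cubic smoothing weights.

```python
# Minimal Savitzky-Golay smoothing sketch for a 1-D Raman spectrum.
# Assumptions (not stated in the abstract): quadratic/cubic polynomial fit,
# mirror padding at the edges, hypothetical function name.

# Standard 9-point quadratic/cubic smoothing weights; they sum to 231.
SG9_COEFFS = [-21, 14, 39, 54, 59, 54, 39, 14, -21]
SG9_NORM = 231


def savgol_smooth_9(intensities):
    """Return a smoothed copy of a sequence of spectral intensities."""
    data = list(intensities)
    n = len(data)
    half = len(SG9_COEFFS) // 2  # 4 neighbours on each side
    if n <= half:
        raise ValueError("spectrum must contain at least 5 points")

    # Mirror the signal about its end points (end points are not repeated)
    # so that every sample has a full 9-point window.
    left = data[half:0:-1]
    right = data[-2:-half - 2:-1]
    padded = left + data + right

    # Weighted moving average with the Savitzky-Golay coefficients.
    return [
        sum(c * padded[i + k] for k, c in enumerate(SG9_COEFFS)) / SG9_NORM
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy spectrum: a flat baseline with one sharp noise spike.
    spectrum = [10.0] * 10 + [40.0] + [10.0] * 10
    smoothed = savgol_smooth_9(spectrum)
    print(max(spectrum), max(smoothed))
```

In practice a library routine such as SciPy's `savgol_filter` would normally be used instead; the hand-written version only makes the weighting explicit. Baseline correction, outlier removal, and the PLS-DA, KNN, and SVM classifiers are outside the scope of this sketch.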