Prediction of concentration immediately dangerous to life or health of benzene and its derivatives based on quantitative structure-activity relationship
- VernacularTitle:基于定量结构-活性关系预测苯及其衍生物的立即威胁生命或健康的浓度
- Author: Xiongjun YUAN¹; Wei ZHAO¹; Jingjie SHI¹; Yue WANG¹; Changhao CHEN²
- Publication Type:Methodology
- Keywords: quantitative structure-activity relationship; immediately dangerous to life or health concentration; affinity propagation clustering algorithm; artificial neural network; predict
- From: Journal of Environmental and Occupational Medicine 2023;40(9):1033-1038
- Country:China
- Language:Chinese
- Abstract:
Background With exposure to hazardous chemicals in the workplace increasing, along with the frequency of occupational injuries and occupational safety accidents, occupational exposure limits for hazardous chemicals are urgently needed. Objective To obtain immediately dangerous to life or health (IDLH) concentrations for hazardous chemicals in the workplace whose values are still unknown, by exploring the application of quantitative structure-activity relationship (QSAR) prediction to IDLH concentrations, and to provide a theoretical basis and technical support for the assessment and prevention of occupational injuries. Methods QSAR was used to correlate the IDLH values of 50 compounds (benzene and its derivatives) with their molecular structures. First, the affinity propagation algorithm was applied to cluster the sample set. Second, Dragon 2.1 software was used to calculate and pre-screen 537 molecular descriptors. Third, a genetic algorithm was used to select six characteristic molecular descriptors as independent variables, from which a multiple linear regression (MLR) model and two nonlinear models, a support vector machine (SVM) and an artificial neural network (ANN), were constructed. Finally, model performance was evaluated by internal and external validation, and a Williams plot was drawn to determine the applicability domains of the selected models. Results For the ANN model, $R_{\mathrm{train}}^{2}$ = 0.8526 and $R_{\mathrm{test}}^{2}$ = 0.8505, root mean square error (RMSE) = 0.5243, mean absolute error (MAE) = 0.4610, and the internal and external validation coefficients were $Q_{\mathrm{LOO}}^{2}$ = 0.8476 and $Q_{\mathrm{ext}}^{2}$ = 0.8905, respectively. By comparison, the performance parameters of the ANN model were superior to those of the MLR and SVM models, and all substances fell within the applicability domain. Conclusion The ANN model performs best in fitting ability, stability, and predictive ability, and is suitable for predicting IDLH concentrations of benzene and its derivatives. QSAR is an effective method for predicting the IDLH concentrations of benzene and its derivatives, and provides a theoretical basis and technical support for the development of occupational health and safety.
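
The validation statistics named in the abstract (R², RMSE, MAE, leave-one-out Q², external Q², and the leverage used in a Williams plot) can be illustrated with a minimal sketch. Everything here is an assumption rather than the authors' procedure: the data are synthetic random numbers standing in for six descriptors of 50 compounds, only a plain least-squares MLR is fitted (no Dragon descriptors, genetic algorithm, SVM, or ANN), the 40/10 split is arbitrary, the external Q² uses the common form referenced to the training-set mean, and the leverage warning value h* = 3(p+1)/n is the conventional choice. The abstract does not state which formulas the paper used.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit with an intercept; returns coefficients (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r2(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y, y_hat):
    return float(np.mean(np.abs(y - y_hat)))

def q2_loo(X, y):
    """Leave-one-out Q²: refit without sample i, then predict sample i."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict_mlr(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def q2_ext(y_train, y_test, y_test_hat):
    """External Q² referenced to the training-set mean (assumed variant)."""
    return 1.0 - np.sum((y_test - y_test_hat) ** 2) / np.sum((y_test - y_train.mean()) ** 2)

def leverages(X_train, X_query):
    """Leverage h_i = x_i (A^T A)^-1 x_i^T, the x-axis of a Williams plot."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    core = np.linalg.pinv(A.T @ A)
    return np.einsum("ij,jk,ik->i", Q, core, Q)

# Synthetic stand-in data (NOT the paper's data): 50 samples, 6 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
w = np.array([1.0, -0.5, 0.8, 0.0, 0.3, -1.2])
y = 2.0 + X @ w + 0.3 * rng.normal(size=50)

# Arbitrary 40/10 split into training and external test sets.
X_train, y_train, X_test, y_test = X[:40], y[:40], X[40:], y[40:]

coef = fit_mlr(X_train, y_train)
y_train_hat = predict_mlr(coef, X_train)
y_test_hat = predict_mlr(coef, X_test)

r2_train = r2(y_train, y_train_hat)
r2_test = r2(y_test, y_test_hat)
rmse_train = rmse(y_train, y_train_hat)
mae_train = mae(y_train, y_train_hat)
q2_internal = q2_loo(X_train, y_train)
q2_external = q2_ext(y_train, y_test, y_test_hat)

# Applicability domain: flag compounds whose leverage exceeds h* = 3(p+1)/n.
h_train = leverages(X_train, X_train)
h_test = leverages(X_train, X_test)
h_star = 3 * (X_train.shape[1] + 1) / len(X_train)
in_domain = h_test <= h_star

print(f"R2_train={r2_train:.4f} R2_test={r2_test:.4f}")
print(f"RMSE={rmse_train:.4f} MAE={mae_train:.4f}")
print(f"Q2_LOO={q2_internal:.4f} Q2_ext={q2_external:.4f}")
print(f"h*={h_star:.3f}, test compounds inside domain: {int(in_domain.sum())}/{len(in_domain)}")
```

A full Williams plot would also need standardized residuals on the y-axis, usually with warning lines at ±3. That step is omitted to keep the sketch short.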