Comparison of the application effects of SARIMA, GAM and LSTM in prediction of hemorrhagic fever with renal syndrome
10.3760/cma.j.cn231583-20211025-00355
- VernacularTitle:SARIMA、GAM和LSTM在肾综合征出血热预测中的应用效果比较
- Author:Tian LIU 1; Menglei YAO; Qingbo HOU; Jigui HUANG; Yang WU; Hongying CHEN
Author Information
1. Institute for Infectious Disease Prevention and Control, Jingzhou Center for Disease Control and Prevention, Jingzhou 434000
- Keywords:
Hemorrhagic fever with renal syndrome;
Prediction;
Seasonal autoregressive integrated moving average model;
Generalized additive model;
Long-short term me
- From:
Chinese Journal of Endemiology
2022;41(9):709-714
- Country:China
- Language:Chinese
- Abstract:
Objective: To compare the performance of the seasonal autoregressive integrated moving average model (SARIMA), the generalized additive model (GAM), and the long short-term memory model (LSTM) in fitting and predicting the incidence of hemorrhagic fever with renal syndrome (HFRS), so as to provide a reference for optimizing HFRS prediction models.

Methods: Monthly HFRS incidence data from 2004 to 2017 for the whole country and for the 9 provinces with the highest HFRS incidence (Heilongjiang, Shaanxi, Jilin, Liaoning, Shandong, Hebei, Jiangxi, Zhejiang and Hunan) were obtained from the Public Health Science Data Center (https://www.phsciencedata.cn/). Data from 2004 to 2016 were used as training data, and data from January to December 2017 were used as test data. SARIMA, GAM, and LSTM models of HFRS incidence were fitted to the training data for the whole country and for each of the 9 provinces; each fitted model was then used to predict HFRS incidence from January to December 2017, and the predictions were compared with the test data. The mean absolute percentage error (MAPE) was used to evaluate fitting and prediction accuracy: MAPE < 20% was considered good, 20%-50% acceptable, and > 50% poor.

Results: In terms of overall fitting and prediction performance, SARIMA was the optimal model for the whole country and for Heilongjiang, Shaanxi, Jilin, Liaoning and Jiangxi (MAPE 19.68%, 20.48%, 44.25%, 19.59%, 23.82% and 35.29%, respectively); performance was good for the whole country and Jilin and acceptable for the rest. GAM was the optimal model for Shandong and Zhejiang (MAPE 18.29% and 21.25%, respectively); performance was good for Shandong and acceptable for Zhejiang. LSTM was the optimal model for Hebei and Hunan (MAPE 26.52% and 22.69%, respectively), and performance was acceptable for both. In terms of fitting alone, GAM had the highest accuracy on the national data (MAPE = 10.44%). In terms of prediction alone, LSTM had the highest accuracy on the national data (MAPE = 12.23%).

Conclusions: SARIMA, GAM, and LSTM can each be the optimal model for fitting HFRS incidence, but the optimal model differs greatly between regions. When HFRS prediction models are built in the future, as many candidate models as possible should be screened to ensure higher fitting and prediction accuracy.
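
The MAPE criterion and the good/acceptable/poor grading described in the Methods can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names, the treatment of the exact 20% and 50% boundaries, the rejection of zero-incidence months, and all numbers in the example are assumptions made here for demonstration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        # MAPE is undefined when an observed value is zero; the abstract
        # does not say how such months were handled, so we refuse them here.
        raise ValueError("MAPE is undefined for zero observed values")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def grade(mape_percent):
    """Grade a MAPE value: < 20% good, 20%-50% acceptable, > 50% poor.

    Assigning exactly 20% and exactly 50% to "acceptable" is an assumption.
    """
    if mape_percent < 20.0:
        return "good"
    if mape_percent <= 50.0:
        return "acceptable"
    return "poor"


def best_model(scores):
    """Return the name of the model with the lowest MAPE."""
    return min(scores, key=scores.get)


# Hypothetical monthly incidence values, for illustration only.
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
example_mape = mape(observed, forecast)
example_grade = grade(example_mape)

# Hypothetical MAPE values for three candidate models in one region.
candidate_scores = {"model_a": 18.0, "model_b": 25.0, "model_c": 61.0}
selected = best_model(candidate_scores)
```

Choosing the model with the lowest MAPE separately for each region mirrors the screening approach recommended in the Conclusions; a real analysis would compute MAPE on both the 2004-2016 fitted values and the 2017 predictions.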