Application of Python web crawler technology in infodemiology
10.3760/cma.j.cn112338-20190901-00643
- VernacularTitle:Python爬虫技术在信息流行病学中的应用
- Author:Jiangjie ZHOU 1; Shengfeng WANG; Liming LI
Author Information
1. Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191
- Keywords:Python web crawler technology; Infodemiology; Public health surveillance; Health intervention; Smart doctor seeking
- From:Chinese Journal of Epidemiology 2020;41(6):952-956
- Country:China
- Language:Chinese
- Abstract:
Python web crawler technology, which automatically retrieves large volumes of information from the Internet by mimicking users' browsing behavior, is a basic supporting technique for extracting and integrating multi-source heterogeneous data in infodemiology. There are two types of Python web crawler, simple and large-scale; both collect information and build the database at the same time. The advantages of the technique are simple syntax, high flexibility, and low learning and maintenance costs. Current application scenarios include public health surveillance, the implementation and evaluation of health intervention programs, and smart doctor seeking. Over the last two years, the Chinese government has begun to encourage the integration and use of multi-source heterogeneous data, including Internet information. The number of application scenarios for Python web crawler technology is therefore bound to increase in the foreseeable future. Matching talent training and technical innovation should be added to the current public health education and research systems.
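To make the abstract's description concrete, the following is a minimal sketch of a simple crawler that mimics a browser request, follows links, and writes each page into a database as it is collected. It is not the authors' implementation. The names `PageParser`, `fetch`, `crawl`, the `pages` table, and the `max_pages` limit are all assumptions introduced here for illustration, and only the Python standard library is used.

```python
import sqlite3
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collect the page title and all hyperlinks from one HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the URL of the current page.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()


def fetch(url):
    """Download a page while sending a browser-like User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, conn, fetch_page=fetch, max_pages=10):
    """Breadth-first crawl that stores each page title in SQLite as it goes."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, title TEXT)"
    )
    queue, seen = [start_url], {start_url}
    visited = 0
    while queue and visited < max_pages:
        url = queue.pop(0)
        parser = PageParser(url)
        parser.feed(fetch_page(url))
        # Collection and database building happen in the same loop.
        conn.execute(
            "INSERT OR REPLACE INTO pages VALUES (?, ?)", (url, parser.title)
        )
        visited += 1
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    conn.commit()
    return visited
```

The `fetch_page` parameter lets the downloader be swapped out, for example with a dictionary lookup during testing. A crawler used in practice would likely also need request throttling and a check of each site's access rules, which this sketch omits.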