Bacterial community characteristics in water from public baths in Shanghai and their association with Legionella pneumophila contamination based on 16S rRNA sequencing and random forest model
- VernacularTitle:基于16S rRNA测序与随机森林模型的上海市公共浴池水细菌群落特征及其与嗜肺军团菌污染的关联
- Author: Lisha SHI; Jian CHEN; Xiaojing LI; Yiming ZHENG; Lijun ZHANG
- Publication Type:Investigation
- Keywords: public bath water; 16S rRNA sequencing; random forest model; community structure; diversity; Legionella pneumophila
- From: Journal of Environmental and Occupational Medicine 2026;43(1):82-88
- Country:China
- Language:Chinese
- Abstract: Background Contamination of public baths with Legionella pneumophila has become a growing public health concern in recent years. However, research on its association with bacterial community characteristics in water samples remains limited. The integration of 16S rRNA sequencing and random forest modeling provides a new approach to elucidate the bacterial community characteristics of public bath water and their association with Legionella pneumophila contamination. Objective To investigate the bacterial community structure and diversity of public bath water in Shanghai, explore the association between Legionella pneumophila contamination and bacterial community characteristics, and identify key bacterial genera associated with contamination, thereby providing a scientific basis for formulating hygiene management regulations for public bath water. Methods From February to March 2023, water samples were collected from ten public baths in Shanghai, selected based on business scale, regional distribution, and functional differences. Water quality parameters were evaluated, and the samples were categorized into Legionella-positive and Legionella-negative groups based on the detection results for Legionella pneumophila. The bacterial community structure, α-diversity, and β-diversity were analyzed using 16S rRNA sequencing. Redundancy analysis (RDA) was used to examine the relationship between physicochemical factors and bacterial community diversity. A random forest model was used to identify key bacterial genera distinguishing the two groups, with genus importance evaluated by mean decrease in accuracy (MDA). Results The oxygen consumption in the Legionella-positive group was significantly lower than that in the Legionella-negative group (mean values: 1.85 mg·L⁻¹ vs. 6.81 mg·L⁻¹, P<0.05), while no significant differences were observed in other physicochemical indicators.
The sequencing results revealed a total of 27 bacterial phyla and 454 bacterial genera, with Proteobacteria (63.00%) being the dominant phylum. The dominant genera included Pelomonas (8.50%), Acidovorax (8.13%), Mycobacterium (7.93%), and Acinetobacter (6.59%). The α-diversity analysis indicated that bacterial community richness (Chao1 and ACE indices) was significantly higher in the Legionella-positive group than in the Legionella-negative group (P<0.01). The β-diversity analysis showed no significant difference in the bacterial community structure between the two groups (P>0.05). The RDA showed that bacterial community diversity was positively correlated with pH and negatively correlated with oxygen consumption and free residual chlorine. RDA1 and RDA2 explained 23.92% and 21.30% of the variation in bacterial community diversity, respectively. The random forest model identified 20 key genera that distinguished the microbial communities of the two groups, including unclassified_Bradyrhizobiaceae (MDA=2.42), Meiothermus (MDA=2.37), and Flavihumibacter (MDA=2.26). Conclusion The diversity of bacterial communities in public bath water is influenced by pH, oxygen consumption, and free residual chlorine. Samples contaminated with Legionella pneumophila exhibit greater microbial richness and contain characteristic key bacterial genera that contribute to community differences. The random forest machine learning model helps identify these distinctive key bacterial genera. The findings provide a basis for developing early risk warning strategies in such settings.
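For readers unfamiliar with the Chao1 richness index mentioned in the Results, the following is a minimal Python sketch of how it can be computed from per-genus read counts. It assumes the commonly used bias-corrected form, S_obs + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of genera seen exactly once and twice; the abstract does not state which variant or software the authors used. The function name `chao1` and both count vectors are hypothetical illustrations, not data from the study.

```python
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-taxon read counts."""
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)        # taxa detected at least once
    freq = Counter(observed)
    f1 = freq[1]                 # singletons: taxa with exactly one read
    f2 = freq[2]                 # doubletons: taxa with exactly two reads
    # Rare taxa (many singletons, few doubletons) raise the estimate
    # above the observed count.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical genus-level read counts for two samples (not study data).
sample_a = [120, 45, 9, 2, 2, 1, 1, 1, 0, 0]
sample_b = [150, 60, 30, 12, 5, 0, 0, 0, 0, 0]

print(chao1(sample_a))
print(chao1(sample_b))
```

In this sketch, a sample with no singletons returns its observed genus count unchanged, so the estimate only exceeds the observed richness when rare genera are present.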
