Machine learning vs. regression models to predict the risk of Legionella contamination in a hospital water network
Keywords:
Machine learning; Water network; Hospital; Artificial Intelligence; Legionella
Abstract
Introduction. Periodic monitoring of Legionella in hospital water networks allows preventive measures to be taken to reduce the risk of legionellosis for patients and healthcare workers.

Study design. The aim of the study is to standardize a method for predicting the risk of Legionella contamination in the water supply of a hospital facility by comparing Machine Learning, conventional, and combined models.

Methods. From July 2021 to October 2022, water sampling for Legionella detection was performed in the rooms of an Italian hospital pavilion (89.9% of the total number of rooms). Fifty-eight parameters regarding the structural and environmental characteristics of the water network were collected. Models were built on 70% of the dataset and tested on the remaining 30% to evaluate accuracy, sensitivity, and specificity.

Results. A total of 1,053 water samples were analyzed and 57 (5.4%) were positive for Legionella. Of the Machine Learning models tested, the most efficient had an input layer (56 neurons), a hidden layer (30 neurons), and an output layer (two neurons). Its accuracy was 93.4%, sensitivity 43.8%, and specificity 96%. The regression model had an accuracy of 82.9%, sensitivity of 20.3%, and specificity of 97.3%. The combination of the models achieved an accuracy of 82.3%, sensitivity of 22.4%, and specificity of 98.4%. The parameters that most influenced the model results were the type of water network (hot/cold), the replacement of filter valves, and atmospheric temperature. Among the models tested, Machine Learning obtained the best results in terms of accuracy and sensitivity.

Conclusions. Future studies are required to improve these predictive models by expanding the dataset using other parameters and other pavilions of the same hospital.
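The evaluation scheme described in the Methods (a 70%/30% split scored by accuracy, sensitivity, and specificity) can be illustrated with a minimal sketch. It is not the authors' code: the helper names `stratified_split` and `classification_metrics`, the stratification by class, the random seed, and the "always negative" baseline are all assumptions made for illustration. Only the class balance (57 positives among 1,053 samples) comes from the abstract.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into train/test, keeping the class ratio in both parts.

    Stratification is an assumption here; the abstract only states a 70/30 split.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Synthetic labels with the class balance reported in the abstract:
# 57 Legionella-positive samples (1) among 1,053 in total.
labels = [1] * 57 + [0] * (1053 - 57)
train_idx, test_idx = stratified_split(labels)

# Hypothetical baseline that predicts "negative" for every test sample.
y_test = [labels[i] for i in test_idx]
baseline_pred = [0] * len(y_test)
metrics = classification_metrics(y_test, baseline_pred)
print(len(train_idx), len(test_idx), metrics)
```

With so few positive samples, a model that never predicts contamination still scores high on accuracy and specificity while having zero sensitivity, which is why the three metrics are best read together when comparing the models.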
License
Copyright (c) 2025 Osvalda De Giglio, Fabrizio Fasano, Giusy Diella, Valentina Spagnuolo, Francesco Triggiano, Marco Lopuzzo, Francesca Apollonio, Carla Maria Leone, Maria Teresa Montagna (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Transfer of Copyright and Permission to Reproduce Parts of Published Papers.
Authors retain the copyright for their published work. No formal permission will be required to reproduce parts (tables or illustrations) of published papers, provided the source is quoted appropriately and reproduction has no commercial intent. Reproductions with commercial intent will require written permission and payment of royalties.