Artificial Intelligence in Occupational Health Surveillance: Evaluating AI-Assisted ILO Classification of Radiographs of Pneumoconioses

Antonio Baldassarre; Martina Padovan; Alessandro Palla; Augusto Quercia; Rita Leonori; Stefano Dugheri; Nicola Mucci; Veronica Traversini

doi:10.23749/mdl.2026.18371

Authors

Antonio Baldassarre Occupational Medicine, Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy https://orcid.org/0000-0002-6124-3570
Martina Padovan Preventive Medicine, Tuscany North-West Health Local Unit, Italy
Alessandro Palla Intel Corporation, Santa Clara, USA
Augusto Quercia (Former chief) Workplace Prevention and Safety Unit, Viterbo Health Local Unit, Italy
Rita Leonori Workplace Prevention and Safety Unit, Viterbo Health Local Unit, Italy
Stefano Dugheri Department of Life Science, Health, and Health Professions, Link Campus University, Rome, Italy
Nicola Mucci Occupational Medicine, Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
Veronica Traversini Occupational Medicine, Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy

Keywords:

ILO International Classification of Radiographs of Pneumoconioses, Artificial Intelligence, Occupational Medicine

Abstract

Background: Pneumoconioses remain an important occupational health issue, particularly in low- and middle-income countries. The International Labour Organization (ILO) Classification standardizes chest radiograph interpretation but requires trained readers and is affected by inter-reader variability. This study evaluated whether generative multimodal artificial intelligence (AI) models can approximate ILO-based diagnostic reasoning.

Methods: Eighty-two chest radiographs from the official NIOSH B Reader syllabus were analysed using four AI systems (GPT-4o, GPT-5, MedGemma-4B, MedGemma-27B). Each image was evaluated with a standardized prompt based on the 2022 revised ILO guidelines using deterministic settings. Model outputs were mapped to ILO codes and compared with the official answer keys of the ILO Standard Radiograph Set used for B Reader training and examination. Performance metrics included balanced accuracy, sensitivity, specificity, precision, and Matthews correlation coefficient (MCC). Bootstrap 95% confidence intervals, McNemar’s test, and Cohen’s κ assessed performance variability and agreement.

Results: All four AI models showed moderate diagnostic performance, with balanced accuracy ranging from 60.8% to 70.3%. Sensitivity remained limited (35.5%–54.9%), while specificity was consistently high (84.6%–86.2%). MedGemma-27B performed best for small opacities, GPT-5 for pleural abnormalities and for technical quality. Large opacities and rare findings were systematically under-detected. Statistical comparisons showed significant differences between models, although agreement patterns were broadly similar.

Conclusion: All AI models partially followed structured ILO radiographic criteria but did not achieve expert-level performance, confirming that they cannot replace certified B Readers. Larger, real-world datasets are needed to assess their potential clinical utility as supportive tools in occupational health surveillance programs.
