DOI: 10.12688/wellcomeopenres.26477.1 ISSN: 2398-502X

Surveillance-Driven Machine Learning for Prediction of Antimicrobial Susceptibility: An Explainable Modeling Framework using the Pfizer ATLAS Dataset (2004–2023)

Raphael Mutua, David Gichohi, Samuel K. Ndegwa, Rahma O. Golicha, Frida Njeru, Benson Kituku

Background
Antimicrobial resistance (AMR) is currently a major global health problem, with Sub-Saharan Africa bearing the brunt of the burden. This heavy burden of AMR is mainly due to poor diagnostic infrastructure and limited Antimicrobial Stewardship (AMS) programs. Timely pathogen detection and Antimicrobial Susceptibility Testing (AST) are crucial for improving patient health outcomes. Conventional culture-based AST methods take 48–72 hours, raising the need for immediate empirical antibiotic therapy. This calls for the development of rapid and easy-to-use techniques for AMR prediction and monitoring for routine and epidemiological surveillance.

Methods
We developed surveillance-driven Machine Learning (ML) models using the Pfizer Antimicrobial Testing Leadership and Surveillance (ATLAS) dataset (2004–2023). The study integrated a rule-based Minimum Inhibitory Concentration (MIC) interpretive tool with antibiotic-specific predictive models. It also incorporated an interactive dashboard for data visualization and decision support. Model performance was evaluated using training and testing accuracy, alongside robustness assessments.

Results
The predictive models demonstrated testing accuracies ranging between 57% and 88% across antibiotics. Bacterial families and species were the best predictors of antibiotic susceptibility, with geographic location, specimen source, and clinical specialty also contributing significantly. In contrast, demographic variables showed comparatively lower predictive value.

Conclusion
Routinely collected AMR surveillance data can be leveraged to develop interpretive and scalable predictive models. Such models are more compatible with existing surveillance infrastructure and may be integrated into clinical workflows with relatively minimal additional resource requirements.
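
As an illustration of the rule-based MIC interpretive step mentioned in the Methods, the sketch below shows one way a MIC value might be mapped to a susceptible/intermediate/resistant category by comparison against breakpoints. It is not the authors' tool: the `Breakpoint` class, the `interpret_mic` function, the organism and antibiotic names, and the numeric thresholds are all hypothetical placeholders, and real breakpoints would come from a published standard.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Breakpoint:
    """Hypothetical breakpoint pair for one organism/antibiotic combination."""
    susceptible_max: float  # MIC <= this value is read as susceptible ("S")
    resistant_min: float    # MIC >= this value is read as resistant ("R")


# Placeholder values for illustration only; they are NOT taken from any
# published breakpoint table or from the paper.
EXAMPLE_BREAKPOINTS: Dict[Tuple[str, str], Breakpoint] = {
    ("example_species", "example_drug"): Breakpoint(susceptible_max=2.0,
                                                    resistant_min=8.0),
}


def interpret_mic(species: str,
                  antibiotic: str,
                  mic_mg_per_l: float,
                  breakpoints: Dict[Tuple[str, str], Breakpoint] = EXAMPLE_BREAKPOINTS,
                  ) -> Optional[str]:
    """Return "S", "I" or "R", or None when no breakpoint is defined."""
    bp = breakpoints.get((species, antibiotic))
    if bp is None:
        return None  # no rule available for this combination
    if mic_mg_per_l <= bp.susceptible_max:
        return "S"
    if mic_mg_per_l >= bp.resistant_min:
        return "R"
    return "I"  # MIC falls strictly between the two thresholds
```

Under these assumed thresholds, calling `interpret_mic("example_species", "example_drug", 4.0)` falls between the two cut-offs and is read as intermediate. Categories produced this way could serve as labels for the antibiotic-specific predictive models, although the abstract does not state how the two components are connected.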
