DOI: 10.3390/toxics14070586 ISSN: 2305-6304

Building Comprehensive Toxicity Data Libraries of Short-Chain Length PHA-Based Materials for the Development of Machine Learning-Based Predictive Tools

Konstantina V. Filippou, Alexandros Angelis, Nikolaos P. Sotiropoulos, Marianna I. Kotzabasaki, Haralambos Sarimveis, Chrysanthos Maraveas

Polyhydroxyalkanoates (PHAs) have emerged as a promising alternative to conventional plastics due to their biodegradability and generally favorable biocompatibility profile, which allow their application in medical fields such as drug delivery systems and surgical implants. However, the toxicity assessment of these materials is complex, time-consuming, and costly. Currently, toxicity data for PHAs are limited, dispersed across various studies, and insufficiently reported, which hinders comparative analysis and the development of predictive models. In response to these challenges, recent developments in predictive toxicology have incorporated machine learning-based approaches to estimate toxicological endpoints while reducing the reliance on in vivo experimentation. The present study aims to construct comprehensive, standardized data libraries for the cytotoxicity and ecotoxicity of PHAs and to develop and evaluate polymer-specific machine learning models that link polymer composition to toxicological outcomes. Several computational workflows were designed for this research, with the Extra Trees Classifier and the Gradient Boosting Classifier as the primary predictive algorithms. Furthermore, Shapley Additive Explanations (SHAP) analysis was performed to identify the descriptors that most strongly influence the predicted cytotoxicity and ecotoxicity. The cytotoxicity model achieved a test Matthews Correlation Coefficient (MCC) of 0.678 and a balanced accuracy of 0.901, with additive type, exposure conditions, and particle morphology identified as the most influential descriptors. The ecotoxicity model reached a test MCC of 0.639 and a balanced accuracy of 0.818 within its applicability domain, with organism- and exposure-level descriptors dominating the predictions. Polymer composition contributed comparatively little, supporting the established biocompatibility of bulk PHB and PHBV. These results, however, should be interpreted in light of the modest dataset size, the heterogeneity of experimental protocols, and the lack of independent experimental validation.
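
The two metrics reported above, MCC and balanced accuracy, can diverge noticeably when the classes are imbalanced. The following is a minimal sketch of how both are computed from a binary confusion matrix. The function names and the example counts are hypothetical and chosen purely for illustration; they are not taken from the study's data or code.

```python
import math


def balanced_accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2.0


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews Correlation Coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical imbalanced test set: 10 toxic samples, 90 non-toxic samples.
    counts = dict(tp=9, fn=1, tn=81, fp=9)
    print(f"balanced accuracy: {balanced_accuracy(**counts):.3f}")
    print(f"MCC:               {matthews_corrcoef(**counts):.3f}")
```

In this hypothetical case both classes are recalled at 90%, yet the MCC is clearly lower than the balanced accuracy because half of the positive predictions are false positives. A gap of this kind between the two metrics is therefore not unusual for imbalanced toxicity datasets.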
